ABSTRACT

The mission of the National Center for Improving 
Science Education, a partnership between the NETWORK, Inc., and the 
Biological Sciences Curriculum Study (BSCS), is to promote changes in 
science curricula, science teaching, and assessment of student 
learning in science. The center analyzes and makes recommendations 
for policy and practice at the national, state, and local levels. As 
part of this task, the center synthesizes and translates the 
findings, recommendations, and viewpoints expressed in research 
studies and develops practical resources for policymakers and 
practitioners. This document is part of a second set of reports that 
focus on science and mathematics education for young adolescents.
Included are chapters entitled: (1) "Assessment: The Middle Years";
(2) "The Opportunity"; (3) "Goals for Science Education and the
Assessment Challenge"; (4) "The Context of Science Education in the
Middle Years"; (5) "Assessment in Middle-Level Science: Improving
Current Practice"; (6) "Innovative Assessments: New Directions"; (7)
"Assessments and Policy"; and (8) "Recommendations." Appended are the 
references, a listing of assessment panelists, and an index. (KR) 
National Center for Improving Science Education



The mission of the National Center for Improving Science Education is to promote 
changes in state and local policies and practices in science curriculum, science 
teaching, and assessment of student learning in science. To do so, the Center syn- 
thesizes and translates the findings, recommendations, and perspectives embodied 
in recent and forthcoming studies and reports, and develops practical resources for 
policymakers and practitioners. Bridging the gap between research, practice, and 
policy, the Center's work is intended to promote cooperation and collaboration 
among organizations, institutions, and individuals committed to the improvement 
of science education. 

The Center, a partnership between The NETWORK, Inc., of Andover, Massachusetts
and Washington, DC, and the BSCS (Biological Sciences Curriculum Study)
of Colorado Springs, Colorado, is funded by the U.S. Department of Education's
Office of Educational Research and Improvement. For further information on the
Center's work, please contact The NETWORK, Inc.; 300 Brickstone Square, Suite 
900; Andover, Massachusetts 01810. 

To order copies of the Center's reports for the elementary and middle years, or
the Center's integrative reports, please contact the Publications Department, The 
NETWORK, Inc.; 300 Brickstone Square, Suite 900; Andover, Massachusetts 01810. 
Bulk order discounts are available. 

This report is based on work sponsored by the Office of Educational Research and
Improvement (OERI), U.S. Department of Education, under grant number 
R168B80001. The content of this report does not necessarily reflect the views of the 
OERI, the Department, or any other agency of the U.S. Government.
The mission of the National Center for Improving Science Education is to promote
changes in science curricula, science teaching, and assessment of student learn- 
ing in science. The Center analyzes and makes recommendations for policy and 
practice at the national, state, and local levels. As part of this task, the Center
synthesizes and translates the findings, recommendations, and viewpoints expressed
in recent and forthcoming studies and develops practical resources for policymakers
and practitioners. The Center's work bridges the gap between research, practice,
and policy, and it promotes cooperation and collaboration among organizations,
institutions, and individuals committed to improving science education. This report
is one in a series. The first set of five reports, released between mid-1989 and 
mid-1990, focused on science education in the elementary years: 

• Science and Technology Education for the Elementary Years: Frameworks for 
Curriculum and Instruction 

• Assessment in Elementary School Science Education 

• Developing and Supporting Teachers for Elementary School Science Education

• Getting Started in Science: A Blueprint for Elementary School Science 
Education 

• Elementary School Science for the 90s: A Guide to Action 

The first three reports focus on curriculum and instruction, assessment, and teacher 
development and support. The fourth report is a summary of the findings and
recommendations documented in the first three. The Action Guide is a practical tool that
science supervisors can use to carry out the Center's recommendations. This
document, Assessment in Science Education: The Middle Years, is part of a second set
of reports that focus on science and mathematics education for young adolescents.
The other reports in this second series include: 

• Science and Technology Education for the Middle Years: Frameworks for
Curriculum and Instruction

• Developing and Supporting Teachers for Science Education in the Middle Years

• Building Scientific Literacy: A Blueprint for Science in the Middle Years 

• Science for the Middle Years: A Guide to Action
The synthesis and recommendations in this report were formulated with the help
of the panel whose members are listed on page 113. We gratefully acknowledge the
help of the many people who have supplied materials and made recommendations
and suggestions for the text of the report. While the list would be too long to
acknowledge each contributor, we wish to give special thanks to Sally Crissman
of Shady Hill School, who provided several of the assessment anecdotes for chapter
5. We also thank Elizabeth Stage of the California Science Project, University of
California, and William Cooley, University of Pittsburgh; their reviews did much
to help improve this report. Thanks are also due to the Center's monitor at the
U.S. Department of Education, Wanda Chambers, for her support.

The Center, a partnership between The NETWORK, Inc., of Andover, Massachusetts,
and the Biological Sciences Curriculum Study (BSCS) of Colorado Springs,
is funded by the U.S. Department of Education's Office of Educational Research and
Improvement. For copies of this report or further information on the Center's work,
please contact The NETWORK, Inc.; 300 Brickstone Square, Suite 900; Andover, 
Massachusetts 01810. 
Chapter I

Assessment: The Middle Years



In this report, the Center addresses the assessment of early adolescents' science 
learning. The Center defines early adolescence as ages ten through fourteen. As
we point out in chapter 4, school arrangements for this age group vary tremendously. 
Students in this age group might be attending an elementary school, a middle school, 
or a junior high school, each of which can span a variety of grades, even a K through
twelfth grade school. One teacher or a team of teachers working together might
provide instruction, or individual teachers responsible for a specific subject, as in high
school, might provide instruction. Often, administrative needs and traditions
govern school organization and instruction within a district. 

In this report, we address science education and assessment for all early 
adolescents, no matter what kind of school they attend. Because each of the terms 
middle school, junior high school, and middle grades carries organizational and 
instructional connotations, we use the more neutral middle level (or middle years)
and early (or young) adolescents when we discuss science education for students
in the ten- through fourteen-year-old age group. When discussing specific types of
schools, we indicate which grades or age groups are appropriate. 

Obviously, anyone engaged in an effort to help schools do a better job in science
education for early adolescents must focus on two issues: (1) improving the science 
curriculum and science instruction, and (2) improving the quality of teaching and 
the competence of science teachers. But why worry about assessment? Six reasons 
are readily apparent, and, although the reasons for the classroom teacher will dif- 
fer somewhat from those of the policymaker, together these reasons provide a strong 
case for improving current assessment practices. The teacher should use assess- 
ment for the following reasons: 

• Assessments help guide instruction and make it more effective. Assessment 
should be used to establish what students bring to the classroom and what they 
are learning as instruction and classroom activities proceed. 

• Assessments impress on the students, school staff, and parents the importance 
of science education and the expectations for science learning at the middle 
level. 

• Assessments document accurately and comprehensively each student's progress
at the end of an extended period of instruction (a semester or school year)
or when a student moves on to a new classroom.
The policymaker should use assessments for three purposes:

• To monitor the outcomes of science instruction, and, in particular, the students'
achievements and competencies in science,

• To provide, when combined with other information, the base for formulating 
approaches that might improve science education, 

• To provide guidance on how resources invested in science education might be 
augmented or used differently. 

Through these means, assessment can, and does, exert a powerful influence on
science education, an influence that has grown as mandates for assessment have
grown. Whether this influence is for good or ill, however, depends on how tests and
other forms of assessment are constructed and their results used. The goals of science
education, curricula and instructional techniques that reflect these goals, and the
tests and other means of assessment used to establish what the students have
learned and can do in science must be consistent with one another. Otherwise, assessments will distort the goals, the
curriculum, and what the teacher chooses to do in the classroom. This is as true 
for assessments controlled and conducted by teachers for their own purposes as 
it is for externally mandated assessments intended for policy uses. Moreover, it does 
little good to improve teachers' assessment practices without making consonant 
improvements in large-scale assessments, so that both will reflect the kind of science 
education that advances the intellectual development and interests of young 
adolescents. 

In the Center's report on assessment in elementary school science education
(Raizen et al., 1989), we focused mainly on how assessment can serve instruction,
that is, how teachers might enlarge and improve their assessment strategies by 
monitoring not only their students' progress in science, but the effectiveness of their 
own science instruction. But externally mandated, large-scale assessments con- 
ducted for policy purposes also must encourage and be consonant with good science 
teaching. Therefore, in the previous assessment report, we attended to this type of 
assessment as well and explored the inherent difficulties in tests administered to 
large numbers of students. 

In the current report, we maintain our emphasis on improving teachers' assess- 
ment practices and have limited our treatment of large-scale assessments, because 
the tests and assessments teachers carry out in the classroom have more direct
consequences for individual students (for their learning and their future engagement
with science) than do district, state, or national assessments. Also, teachers have
available an array of assessment strategies that can deeply probe the students'
progress and link with more relevance to instructional practices in particular classrooms.
These useful strategies are only just beginning to be introduced into large-scale 
assessments. Nevertheless, the lack of correspondence between many large-scale 
assessments and good science education for early adolescents continues to be a 
troubling concern. 

In this report we first review the capabilities and interests of early adolescents 
and consider the nature of an education, especially in science and technology, that 
can build on those capabilities and interests. In chapter 2, we discuss what is known
about the cognitive and social development of ten- to fourteen-year-old students. Our
coverage includes, but is not limited to:

• the students' increasing potential for engaging in the kind of thinking that
characterizes science,

• the students' developing reasoning skills,

• the students' growing interest in evaluating themselves and others, and

• the students' continuing need for concrete experiences, even as they are helped 
to develop formalized abstract thinking patterns. 

In chapter 3, the Center presents goal statements it has developed in the
companion curriculum and instruction report (Bybee et al., 1990). The goals reflect the
growing capabilities of early adolescents to deal with science content and methods,
and they reflect widely made recommendations for the education of students in this
age group. The goals address not only science content, but also the relationships 
between science and technology and the relationships between science, technology, 
society, and individuals— a particularly motivating subject for this age group. Also 
discussed in chapter 3 are the assessment challenges these goals pose and the 
knowledge, skills, and dispositions the students should acquire. 

The dilemmas of assessment at the middle level are quite like those encountered
in the elementary grades, especially those concerning the need to assess
thinking skills as opposed to assessing knowledge of subject matter. Assessment at the
middle level encounters significant complications, however. Early adolescents ex- 
hibit thinking skills that are more complex than those exhibited by elementary 
students, and their knowledge base is larger. Also, if one educational goal is to 
develop critical-thinking and problem-solving capabilities for a variety of situations, 
assessment of transfer must be addressed. Furthermore, if learning-to-learn and
self-assessment skills are educational goals for the adolescent years, strategies for their
assessment must be developed and included in assessment batteries. 

Because assessment must be set in the schooling context, we next review, in
chapter 4, the nature of science programs and their broad middle-level school
environment. We first consider current recommendations for the education of early
adolescents in general and science education in particular. We then contrast and
compare those recommendations with actual practice in today's middle-level
schools.

Chapters 5 and 6 are the core of this report. In these chapters, we detail our
conception of assessment and instruction in the service of science learning. In chapter
5, we point out the opportunities presented by the growing cognitive abilities of early
adolescents. These opportunities include assessments that inform teachers of their
students' progress in science and help teachers guide the course of instruction. We 
recognize the continuing, although limited, utility of well-designed traditional tests, 
but we also attempt to broaden the definition of what counts for assessment. Several 
examples illustrate how scientific inquiry itself can serve to provide assessment 
opportunities and how teachers can weave assessment into their science teaching.
In chapter 5, we stress approaches possible now in any good middle-level science 
classroom. In chapter 6, we consider future directions in assessment and discuss 
several approaches still in experimental or trial stages. Several of these approaches 
come from state initiatives for assessing science or mathematics learning; thus,
chapter 6 builds a bridge to chapter 7, in which we take up assessment for policy
purposes. A key point of chapter 7 is that decision makers who wish to improve 
science education need information not only about the full range of learning out- 
comes, but also about the context in which these outcomes are being achieved— 
types of students, characteristics of programs and instruction, and types of teachers 
and teaching conditions. 

In summary, what does good assessment of middle-level science education look 
like? 

• Assessments should match exemplary instruction. Assessment exercises 
should be indistinguishable from good instructional tasks and will often be in- 
terwoven with them. 

• Exercises should include hands-on performance tasks that allow students to 
demonstrate their proficiency in laboratory activities and scientific thinking. 

• Assessments should probe the student's depth of understanding as well as 
knowledge of subject matter.

• The emphasis should be on both the approach and the product: how an answer 
was obtained, how the student carried out a hands-on activity or conducted 
an investigation, and the student's final result. 

• Some assessments should be built around a student's research or design
project, free from the time constraints usually imposed by tests and assessments.
Opportunities should be provided for self-assessment and course correction
as the students proceed through the project, so that the teacher can check
whether the student's proficiency in these important management skills has
grown. Such projects also would allow judgments on competence in writing,
presentation of data, the use of mathematics, and, if appropriate and
available, use of the computer.

• The notion of "product" or "performance" must be enlarged to include not just 
written reports about experiments and answers to test questions, but also 
speeches, models, drawings, group presentations, and displays. 

• There should be opportunities for group work designed around tasks too com- 
plex for students to undertake individually. In addition to providing informa- 
tion on the student's science learning and performance, this would allow the 
teacher to judge how effectively the individual communicates and contributes 
to the group, that is, the student's ability to collaborate effectively. 
• If policymakers wish to use assessments for making improvements in science
education, then they must take care to collect information not only on student
outcomes, but also on the schooling context and science programs that
produce the student outcomes.

In chapter 8, we conclude this report with recommendations
for actions that will bring about assessments consonant with the
goal of providing effective science education for all young adolescents.






Chapter II 

The Opportunity 



During early adolescence humans develop the capacity to think in a way qualitative- 
ly different from the thinking typical of students in the early elementary grades. A 
large proportion of youths, however, do not realize this capacity. Studies done in
the United Kingdom with students in comprehensive secondary schools show that 



skills commonly used in science— although the percentage ranged from 60 to 85 
percent in elite secondary schools (Shayer and Adey, 1981). Renner et al. (1976) came 
to similar conclusions about twelfth-grade students in the United States. Failure to 
develop these higher order thinking skills places limits on an individual's contribu- 
tions to society and potential for personal development. Formal education consists 
of structured experiences and opportunities to reflect on these experiences. Formal 
education is critical to the realization of the capacity for reasoning and higher order 
cognition. Also, science, as an important component of formal education at the
middle level, can directly support the development of formal operational thinking. But
to do so, science education must be designed to take advantage of the early adoles- 
cent's cognitive and social development. 

The Adolescent and Adolescence: 

Perspectives 

The magnitude of the physiological, cognitive, and social changes that take place 
in early adolescence, the years from ten through fourteen, is second only to the 
magnitude of changes that take place in the first eighteen months after birth. The 
rate of physical growth accelerates, the secondary sexual characteristics develop, 
and the physiology of the brain changes. 

Two of the socio-psychological factors that are distinctly different in these two 
periods of the human life cycle are the degree of awareness on the part of the 
young person that physical and psychological changes are taking place, and the 
distribution of control between the youth and responsible adults. Unlike the 
eighteen-month-old child, who must be constantly reminded by adult caretakers 
that it is only so big, adolescents are acutely aware of their rapid growth and the 
appearance of secondary sexual characteristics. Furthermore, adolescents are em- 
barrassed that others— adults or peers— also have noticed these changes. In 
addition to the more noticeable physical changes, early adolescents are also
developing an awareness of their thinking processes and the powerful new strategies
available to them. 

The locus of control is also very different in these two periods of development.
While the infant exerts independence as it enters early childhood, the adults in the 
infant's life have the edge in size, reasoning capacity, and control over resources. 
The adolescent, in contrast, is becoming physically competitive with adults. The 
adolescent is developing a capacity to reason as an adult, and biological forces
compel the adolescent to be independent, although adolescents nevertheless realize that adults
are still very much in control.

Majority culture in the United States views adolescence as a traumatic,
unpleasant period in life through which young adults must be shepherded as quickly as
possible. Adults tend to assume that adolescents find this period in their lives as 
painful as those around them do, although evidence from research and interviews 
with adolescents contradicts this assumption (Committee on Adolescence, Group
for the Advancement of Psychiatry, 1968; Offer et al., 1981). Whether the "trauma" 
of adolescence is inevitable and universal or an artifact of particular cultures is an 
empirical question for which no firm evidence exists. Our stance on this matter,
based more on philosophy than science, is that the extent of the "trauma" can be
reduced considerably if society provides more support for youth in this period. What
is unclear is the kind of support that is best. Adults fear for the safety of adolescents
who tend to look outside the home for values and models on which to pattern their 
behavior. The typical adolescent engages in high-risk behaviors, some as dangerous 
as using illegal drugs and alcohol, experimenting with sex, or operating motor 
vehicles irresponsibly. Adolescents often act as if they believed that they were im- 
pervious to the dangers of everyday life. Consequently, parents and educators alike 
contrive ways to protect them. Two strategies for coping with this "dangerous" time
are:

• Keep the early adolescent busily engaged in desirable activities (studying,
sports, art and music lessons) so that neither time nor energy remains for
participating in undesirable activities.

• Create an environment (a playpen, if you will) in which the adolescent is
prevented from engaging in potentially injurious behaviors.

The second strategy, unfortunately all too typical of American education, provides a safe
environment that does not promote the development of the cognitive capacities
of early adolescents.

The following description of the intellectual development possible during this 
period is based largely on the work of Jean Piaget, whose observations continue
to influence the practice of science education. While the theoretical interpretations 
and practical implications of Piaget's research have been the subject of considerable 
debate, his descriptions of the reasoning characteristics of infants, children, and 
young adults illuminate most current thinking about the learning and teaching of 
science (Flavell, 1963; Case, 1985). Present-day cognitive researchers who are
building on Piaget's work on the evolution of thinking tend to emphasize the
continuity rather than the difference between the reasoning stages that Piaget has
described (Carey, 1985). Rather than postulating that individuals engage in fun- 
damentally different ways of thinking at different stages of development, these 
researchers hold that the less-skilled thinkers, regardless of age, know a great deal 
less about the domain with which they are dealing, and it is this lack of knowledge 
and understanding of concepts in a particular domain that keeps them from
engaging in the more complex reasoning processes.

is to deepen the students' knowledge and understanding so they can develop the 
higher order thinking skills described by Piaget. In the next section we summarize
Piaget's descriptions of higher order thinking as they are relevant to science educa- 
tion, particularly the development of formal operational thinking. 

Formal Operational Thinking 

In early adolescence, students begin to display a qualitatively different kind of 
thinking about the natural world and the individual's place in it than that general- 
ly displayed by younger children. Early adolescents acquire more knowledge and 
a more sophisticated organization of that knowledge, and their intellectual 
development proceeds to the point at which scientific thinking can be observed. 
According to Piaget, formal operational thinking represents the highest form of 
human thought and is characterized by the individual's ability to:

• engage in hypothetical-deductive reasoning, 

• engage in propositional reasoning,

• use combinatorial analysis and proportional reasoning, 

• reflect on one's own thought processes, and 

• consider issues and situations from different perspectives. 

Hypothetical-deductive reasoning. The ability to conjecture alternatives to
reality and to test systematically the alternatives against available data indicates
an individual's ability to use hypothetical-deductive reasoning. This ability entails
controlling variables and reasoning from a set of premises. The competencies
might be teachable (Linn and Levine, 1976), although they appear to some extent
to depend on the formulation of a given problem. Hypothetical-deductive reason- 
ing enables individuals to have thoughts that go beyond the "here and now." Also, 
these thoughts can influence the adolescent's social and moral cognition. 

Propositional reasoning. In contrast to the student in the early elementary
grades, who tends to think in concrete, operational terms and mentally 
manipulate only real objects, the older student who displays formal operational 
thinking is capable of reasoning using abstract propositions, hypotheses, and 
quantitative relationships, at least in familiar domains (Flavell, 1963).
Hypothetical-deductive and propositional reasoning are the basis for an individual's ability to
reason scientifically.
Combinatorial analysis and proportional reasoning. Components of scientific
thinking include hypothetical-deductive reasoning, propositional logic, and
combinatorial analysis, the component most closely associated with experimental
design and data analysis. Tasks that Piaget used to test for hypothetical-deductive
reasoning and combinatorial analysis require the student to generate lists of
factors that might account for how a physical system functions (for example, the period
of a pendulum's swing) and then to determine which factor actually influences the
system by testing each factor while holding the other factors constant. Proportional
reasoning is a mathematical skill essential to scientific reasoning. A task used to 
test for this skill involves an object (a stick figure is often used) represented by us- 
ing two different scales. Lengths of the component parts of one representation are 
given in some arbitrary unit— the length of an arm in paperclips, for instance. The 
task is to figure out how long a corresponding part is on the other representation. 
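
As a worked illustration of this task, the following sketch uses hypothetical measurements that do not come from the report. Assume the small figure is 6 paperclips tall with an arm 4 paperclips long, and the large figure is 9 paperclips tall. The unknown arm length x on the large figure follows from keeping the ratio of arm to height the same in both representations:

```latex
\[
\frac{x}{9} = \frac{4}{6}
\quad\Longrightarrow\quad
x = 9 \times \frac{4}{6} = 6 \ \text{paperclips}
\]
```

A student reasoning additively would instead add the 3-paperclip height difference to the arm and answer 7, which is the kind of error the task is designed to reveal.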

Reflective thinking. Awareness and assessment of one's own thinking processes
are characteristics of formal operational thinking. This quality of thinking enables
students at the middle level to accomplish five tasks:

• describe how they learn best,

• improve their own learning, 

• assess the strengths and weaknesses of their problem-solving skills, 

• assess the extent to which they understand, and 

• assess how well they are meeting the teacher's expectations. 

Not only do these and other related skills make it possible for the students to assess 
their own work; these skills also enable them to improve themselves. 

Consideration of issues and situations from different perspectives. The
ability to consider issues and situations from different perspectives is characteristic
of formal operational thinking. Thus, an adolescent can engage in recursive thought, 
that is, thinking about the thoughts of others, and contrast sets of perspectives of 
self and others. Concurrently, young adolescents tend to be egocentric, even as they 
develop their ability to distinguish between their own concerns and those of others. 
As their ability to place themselves in a wider social context increases, adolescents 
begin to see themselves as having a personal and a social destiny (Lipsitz, 1977). 
Being able to shift perspectives is critical to scientific thinking. For example, spatial 
reasoning, a particular form of considering different perspectives, closely correlates
with scientific achievement. Spatial reasoning implies 

• the skills necessary to represent the spatial relationships of objects to each other 
as they would appear from vantage points other than the one from which the 
individual is viewing them, and

• the skills necessary to represent how an object would appear from various van- 
tage points or how the object would appear after a linear or rotational 
transformation. 
Another sort of competence in shifting perspective is the ability to consider how
others might think about a situation or an issue. The ability to develop an effective 
scientific argument is dependent on three skills: 

• generating possible perspectives that others might take, 

• determining which of these a particular individual holds, and 

• developing a line of argument to counter or complement alternative points 
of view. 

In traditional Piagetian theory, the logic structures that facilitate spatial reasoning 
also operate in the ability to understand the perspective of others; more recent 
research has underlined the importance of context and experience in the ability 
to shift perspectives. 

Instructional implications. As the early adolescent's ability to reason, reflect,
and consider other perspectives grows, the educator might be tempted to reduce 
direct experience with hands-on activities in favor of reading and writing about and 
discussion of science and technology. Although many students at this age are becom- 
ing more adept at abstract thinking, more comfortable using mathematics, and more 
skilled and practiced in using thinking skills to solve problems, they are nevertheless 
concrete thinkers most or part of the time. New learning is often elusive, understood
at one moment, slipping away at another. Therefore, problem-solving and decision- 
making skills are best practiced around a concrete, visible, memorable activity or 
a real experiment, because skills and concepts thus learned can be remembered 
from a tangible context. Moreover, hands-on experimentation in science provides 
opportunities for planning, observing, selecting evidence, formulating and ruling 
out rival interpretations—in short, learning how to impose structure on experience. 

Scientific Thinking. 

Formal operational thinking is a characteristic of scholars in all academic
disciplines. It is also a characteristic of the highest levels of social and moral cogni-
tion. Piaget's detailed descriptions of formal operational thinking, however, are
drawn largely from mathematics and the physical sciences. 

The nature of scientific thinking. Scientific thinking is the product of formal 
reasoning strategies operating on a knowledge base. The structure of the knowl- 
edge base reflects the nature of the reasoning processes that store information in 
it. Of particular interest in formal operational thinking are two structural features 
of the knowledge base that arise from an individual's ability (a) to categorize ob- 
jects, events, or ideas on the basis of conceptual rather than perceptual features 
and (b) to understand concepts at a theoretical level. The knowledge base of an in- 
dividual skilled in formal operational thinking is different from that of someone 
thinking in concrete terms, because the former is capable of abstract categoriza- 
tion. Concrete operational thinking only requires categorization on the basis of 
physical attributes —objects on the basis of color, or sounds on the basis of pitch, 
for example.

Formal operational thinking requires categorizations of objects or symbols us-
ing abstract features. Categorizing chemical changes according to reaction type
(oxidation-reduction or neutralization), physics problems according to the physical
laws that must be applied to solve them, or organisms according to their function
in a biological system all present examples of formal operational thinking (Chi et
al., 1981).

Individuals thinking in formal operational terms also can understand concepts 
at higher levels of abstraction than do individuals thinking in concrete operational 
terms. Concepts can be understood on at least three different levels of abstraction:
phenomenological, experimental, and theoretical. At the phenomenological level
of abstraction, understanding implies familiarity with the qualitative aspects of 
phenomena. Density, for example, can be understood in terms of phenomena— 
objects and substances floating and sinking in liquids and gases. However, 
understanding at this level does not imply that the individual understands the ex- 
planation, only that the individual can completely and accurately describe the 
phenomena. 

At the experimental level of abstraction, understanding density means knowing 
how to measure volume and mass and, consequently, density. While understanding 
density at this level involves manipulation of concrete objects, the fact that density is
a derived quantity, the ratio of two measured quantities (volume and mass), means 
that understanding density at this level requires proportional reasoning. Formal 
thinking is also necessary to understand the explanation for floating and sinking
phenomena. 

At the theoretical level of abstraction, understanding of density implies know- 
ing that density is an intrinsic property of substances, a property that depends upon 
the mass of the molecules of which the substance is composed and upon the number 
of molecules in a unit volume of the substance. The knowledge base necessary for 
formal operational thinking contains integrated information about a concept like 
density at all three levels of abstraction. 

Developing scientific thinking. According to psychological theory, three fac- 
tors affect intellectual development (including the development of a science-relevant 
knowledge base and science-related skills): physiological maturation, interaction 
with the natural world, and social experience. Developmental psychologists tend 
to downplay the influence of formal educational experiences (a type of social ex-
perience) in the development of formal operational thinking. Other psychologists,
the neo-Piagetians, for example, admit to the effects of formal education on the
acquisition of formal thinking. For this reason, science educators stress the import-
ance of hands-on science work linked to the student's experiences and accompanied
by discussion among groups of students as well as with the teacher. They stress the
development of conceptual schema through effective education rather than the
physiological development of logical structures. Our report is predicated on the
premise that this sort of well-conceived school science can contribute to the attain-
ment of formal operational thinking.

Creating educational environments in which early adolescents can capitalize on
their expanding capacities requires thinking through the relationship between
development and learning. The way in which the relationship between develop-
ment and learning is construed influences the nature of the school science ex-
perience. In much of the educational literature, the distinction between learning
and development is blurred and the relationship between them subject to different 
interpretations. One interpretation is that development of formal operational think- 
ing occurs independently of formal education. According to this interpretation, 
cognitive developmental level is a critical factor limiting the sophistication of the 
subject matter that can be learned. This implies that the cognitive demands of lear- 
ning science subject matter should not exceed the developmental level of the learner. 

Another, more constructive view is that learning contributes to cognitive develop- 
ment. According to this interpretation, experiences with the natural world that a 
child interprets in a social environment contribute in small increments to the child's 
knowledge base and repertoire of thinking skills. When these experiences are con- 
current with the appropriate physiological maturation, one observes the "quantum 
leaps" in thinking that characterize the transition from one level of cognitive 
development to another. In practice, based on this interpretation, subject matter 
is selected for its contribution to the development of formal operational thinking. 

The Goals of School Science and Formal Operational Thinking

When the goals of school science are stated in terms of the characteristics of the 
successful science student, the close correspondence to the characteristics of for- 
mal operational thinking is evident. Both the ideal product of twelve years of 
school science and individuals skilled in formal operational thinking can 

• understand scientific concepts, principles, laws, and theories; 

• criticize the design of experiments as well as design experiments; and 

• understand the sociology of the development of scientific knowledge. 

Furthermore, being able to learn on one's own and to assess one's understand- 
ing and progress toward achieving a goal also are desired outcomes of school 
science. Although the goals of science education correspond significantly to the 
operational definition of formal operational thinking, one critical difference, 
which has implications for practice in science education, centers around "know- 
ing" something and being able to "figure it out." Piaget operationally defines 
stages in the development of reasoning skills in terms of the ability to respond to 
an unfamiliar task, no matter what the domain. Thus, in his view, a correct 
response, which would include both the "correct" answer and justification for that 
answer, indicates that the reasoning structures necessary to analyze the task
"logically" are available to the student, quite apart from exposure to the subject
matter. In contrast, the assumptions underlying the goals of school science start
with domain knowledge; that is, the student will know the right answer and be able
to justify it after exposure to the relevant subject matter. Even when goals for science
education refer to application of knowledge and reasoning skills in unfamiliar situa-
tions, the new situations are generally domain specific; that is, they entail scien-
tific knowledge and reasoning skills applicable to academic, personal, and civic
problems related to science. Transfer to the domain of analyzing historical exposition,
for example, would be considered far transfer, and not expected as an explicit out-
come of science education.



Issues and Dilemmas 



As one contemplates the possibility of "detraumatizing" adolescence by providing 
an environment in which the developmental tasks of adolescence can be achieved, 
one must recognize the impediments to the realization of every student's poten- 
tial: lack of knowledge about the detailed nature of the formal experiences that 
help to enhance the development of formal thinking; institutional and structural 
issues that include teacher preparation and beliefs (see the Center's companion 
report on teachers and the teaching context at the middle level); and resources— 
what society is willing to invest to ensure that all but the severely mentally disabl- 
ed develop formal thinking. Among the considerations with regard to resources is 
the relative importance of the development of intellectual skills compared to the 
many other developmental tasks of adolescence. This particular issue creates
dilemmas for educational practice in general and for science education in par- 
ticular. Some dilemmas are philosophical: How does formal operational thinking 
contribute to the valued outcomes of school science? How does school science 
contribute to the development of formal operational thinking? Does society value 
formal operational thinking? If so, how much? To what extent is society willing to
devote its resources to achieving formal operational thinking in all youths? Some 
of the dilemmas also have a theoretical component: Are all "normal" youths 
capable of becoming formal operational thinkers? What is known about the extent 
to which the development of formal operational thinking can be facilitated? If 
development can be facilitated, how is that best accomplished? Is the develop- 
ment of formal operational thinking accomplished best through the study of 
science? If so, what should be the nature of the science experience? How do ex- 
periences with the natural world influence the development of formal thought? 
Do educational experiences that develop understanding of science concepts at a 
theoretical level and the ability to design a valid experiment create a formal opera-
tional thinker or, rather, is it the case that only the formal operational thinker can
come to understand science? What in all this is the role of social interaction? 
Developers of science curricula and instruction need to consider careful responses 
to these questions, as well as define the optimal conditions under which true
scientific thinking develops. In the next two chapters, the goals of science educa- 
tion at the middle level (chapter 3) and recommendations for science instruction 
(chapter 4) are discussed. In these two chapters, special attention is paid to the 
growing capacities of young adolescents. Also, the recommendations are con-
trasted to actual current practice.

Chapter III

Goals for Science Education
and the Assessment Challenge



Scientific and technologic literacy for all citizens stands high on the list of educa-
tional needs for the year 2000 and beyond (National Commission on Excellence
in Education, 1983; National Science Board Commission on Precollege Education
in Mathematics, Science and Technology, 1983; Task Force on Education for
Economic Growth, 1983; Twentieth Century Fund, 1983; however, for a dissenting 
view, see Shamos, 1988). To summarize the arguments made by advocates of science 
education: Not only will the economy require an increasing number of scientifically 
and technically trained professionals and support personnel, but most production 
and service jobs will require a modicum of quantitative and technical skills (Botkin 
et al., 1984; but see Levin and Rumberger, 1983, for counterarguments). Moreover,
an increasingly complex interlinking of the man-made and natural worlds makes 
it important for people to understand the basic parameters of both these worlds and 
their functioning, so that they can make effective personal and civic decisions. Re-
cent reports have interpreted in some detail the meaning of scientific and
mathematical literacy with respect to student learning and proficiency in these fields 
(American Association for the Advancement of Science, 1989; Mathematical 
Sciences Education Board, 1989). 

The period of early adolescence can be an exciting time in science education. 
Middle-level science instruction must bridge the introduction of science as a set of 
accessible activities in the elementary years and science as a sophisticated form of 
intellectual inquiry in high school. Children exposed to good science in the elemen-
tary school years have seen something of the ways in which scientists approach
problems, pose questions, and collect and organize information, but they probably have
seen little of the formal, systematic knowledge structures that characterize mature
scientific disciplines. In the elementary years the students lack intellectual maturity,
which limits their ability to work with abstract formal systems. Also, they are just
beginning to develop the "tool skills," especially mathematical understandings and
symbol systems, necessary to work with abstract scientific concepts. Science instruc-
tion in high school is grounded in the scientific disciplines. It is formal, rigorous,
and often quantitative. Therefore, middle-level science instruction bridges elemen-
tary science and high school science by introducing the power, excitement, and utili-
ty of formal scientific systems without communicating to children that real science
is only comprehensible to the brightest students, the mathematically precocious, 
or boys.

Education of young adolescents has the dual purposes of providing for their con-
tinued personal development and fulfilling the aspirations of society. Early in this
century, the literature on junior high schools and in recent decades the literature
on middle schools have continuously emphasized the goal of personal development
for early adolescent students. While personal development as a goal is appropriate,
middle-level educators should not lose sight of the second goal, that of contributing 
to the society in which the adolescent lives. Putting it more succinctly, the aims of 
science education in the middle years are to develop the student's capacity to 

• think scientifically and use the tools and strategies of science, and to 

• apply science knowledge and skills in addressing individual and societal 
problems.

These broad aims lead to several more specific goals.



Goals for Science

The goals as spelled out represent general directions. While all students may not
attain all goals with equal proficiency and understanding, all students should
develop some proficiency and understanding. The goals stated for the elementary
years in the Center's earlier reports (Raizen et al., 1989; Loucks-Horsley et al.,
1989; Bybee et al., 1989) share common elements with those stated here, as the
Center sees articulation of subject matter drawn from science education across 
grade levels as a critical element of reform. In particular, we continue to em- 
phasize the importance of teaching both science and technology and connections 
between them as well as the importance of engaging students in activities relating 
to science and technology. The variations between goals for elementary and mid-
dle levels are based primarily on the student's developing capacities, as described
in the preceding chapter. 



Goal 1: Science and technology education should develop adoles-
cents' ability to identify and clarify questions and problems
about the world.

Young adolescents are first and foremost interested in questions and problems 
that relate directly to them. Constructing a middle-level curriculum could easily 
begin with such questions as: What is normal? Why do organisms behave the way
they do? How are things made? Why do things change? What are the relationships
among things? These questions are intentionally ambiguous. Young adolescents
seldom state questions with immediate personal connections (Why do I change?
Am I normal?), although these questions are probably closer to their interests.
The point here is to begin with questions and problems that have meaning for 
adolescents, rather than with concepts and skills that have scientific and 
technologic significance but seem abstract and removed from life. Although 
adolescents are likely to have many questions about themselves and their surround-
ings that are related to science and technology, they may not see this connection
until they pursue their questions in greater depth. Asking questions and identify-
ing problems are the first steps in scientific inquiry and technologic problem solv-
ing. It is appropriate, therefore, to introduce students to science and technology
education in response to their questions and problems.



Goal 2: Science and technology education should broaden adolescents'
operational and critical thinking skills for answering questions,
solving problems, and making decisions.

As they develop explanations and solve problems, scientists and engineers use 
cognitive processes and intellectual models that differ from those that people com- 
monly use. Observation, experimentation, and construction of theories in science, 
as well as consideration of cost, risk, and benefit in technology, are examples of the 
processes included in this goal. Adolescents should learn what and how scientists 
and engineers think and why they think the way they do. Students need an introduc-
tion to the intellectual rigor and demands of scientific inquiry and technologic prob-
lem solving: the need for evidence, the use of logic and creativity. Learning to for-
mulate sound and coherent explanations and developing a nonauthoritarian, skep-
tical posture are also important. In addition, students need to acquire the social and



connects to other general aims important to middle-level education. Among these 
other aims are the development of adolescents' operational and critical thinking 
skills and their physical, social, and emotional capabilities. 



Goal 3: Science and technology education should develop adolescents' 
knowledge base. 

Knowledge must be a central concern of science and technology education. Tradi- 
tionally, the science curriculum (including that designed for young adolescents) has 
consisted of facts, information, and concepts that represent the life, earth, and 
physical science disciplines. The criterion for selection and inclusion of subjects 
has been that the curriculum should represent the accumulated information within 
each discipline. The task of the teacher has been to present the information. Tests 
were used to determine what information the students had retained. 

The Center recognizes the importance of adolescents' ability to acquire and apply
knowledge in personal and social contexts, and the Center's goals reflect this. We
recognize the dynamic nature of science and technology and thus recommend 
presentation of scientific and technologic knowledge as proposed explanations and 
proposed solutions. The emphasis should not be on trivial facts and isolated infor- 
mation. Rather, the emphasis should be on the acquisition of a knowledge base, 
the acquisition of concepts that unify disciplines within the sciences, and an
understanding of technology. The focus of middle-level education should be on
knowledge, concepts, and procedures that have the widest potential for use in one's
life, and that help the individual meet family and societal responsibilities. Since lear-
ning often proceeds from specific

reaching a deep, rich, and rewarding appreciation and understanding of a relative-
ly small number of carefully chosen phenomena that provide opportunity for broad-
ly applicable methods of inquiry.



Goal 4: Science and technology education should develop adolescents'
understanding of the history and nature of science and
technology.

Adolescents also need to understand science and technology as cultural and social 
activities. Historical and present-day examples can vividly illustrate how society and 
culture influence science and technology and how technology and science influence
culture and society. Thus, the social context in which scientific explanations and 
technologic solutions are presented determines their shape. By the same token, 
some scientific and technologic events have historical significance and have helped
shape Western culture. Consider, for example, the revolutions of Nicolaus Coper- 
nicus, Isaac Newton, and Albert Einstein; the contributions of Charles Darwin, 
Charles Lyell, and James Watson and Francis Crick to the current understanding 
of the processes of evolution; and the roles of such individuals as James Watt in the
Industrial Revolution. These developments have had significance beyond their
scientific content and technologic products; indeed, they have changed world culture 
(Bybee, 1990). There is another important reason for spending time on the history
and development of science and technology. Students' conceptual understanding 
of the world sometimes appears to parallel that of history; for example, many 
students have an Aristotelian view of nature. Presenting different historical perspec- 
tives, while affirming that others have perceived the world the way some people 
do now, can serve to challenge problematical conceptions that students might hold 
and provide structures for reformulating their explanations. Adolescents should
begin developing an understanding of the nature of science and technology. They
should see science as a particular way of knowing about the world, and technology
as a way people adapt to their environment. How do the sciences and technology
advance? What constitutes a valid scientific explanation? How is science different 
from other ways of knowing, such as history, literature, or religion? Is technologic 
problem solving different from other forms of problem solving? Science for All 
Americans (American Association for the Advancement of Science, 1989) provides 
examples that further clarify both what this goal includes and the conception of 
science and technology that we hold at the Center. The adolescent should under- 
stand that science assumes the world is understandable, that scientific explanations 
are durable but subject to change, and that science cannot explain all things. Con-
cerning technology, adolescents should understand the interactions between science
and technology, that technologic problem solving involves design under constraint, 
that technology involves control, that technologies can have unintended conse-
quences, and that technologic systems can fail.

Goal 5: Science and technology education should advance adolescents'
understanding of the limits and possibilities of science and
technology in explaining the natural world and solving human
problems.

Science and technology directly relate to contemporary American life. They serve 
as agents for social change, and, in turn, they are changed by society. Individuals 
and nations are increasingly being asked to make decisions that influence the quality
of life. Understanding the limits and possibilities of science and technology has a
direct bearing on the goals for general education in the sciences. This goal encom-
passes the need to develop personal decision-making abilities. This goal also ex-
pands the adolescent's potential for meaningful work and careers and cultivates the
adolescent's citizenship responsibilities. These goals represent an integration of our
conception of science and technology and the major orientation of middle-level
education. The task is to see that young adolescents develop, in a personal and social
context, the most complete and accurate understanding of science and technology
that is possible at their age and stage of development. Not only is it important for
them to understand the processes, the concepts, the history, and the nature of science
and technology; it is equally important that these adolescents recognize what science 
and technology can and cannot do, what they are and what they are not, and how 
they do and do not influence individuals and society. 

The Assessment Challenge

The goals of middle-level science instruction have implications for assessment. 
The classroom tests teachers develop and use both express their own understand- 
ings and also communicate to their students what is important to learn from 
science instruction. If only new vocabulary is tested, there is an implicit message 
that science is mostly a matter of memorizing new terms. If only factual 
knowledge is tested, the message may be that science is a static body of facts, prin-
ciples, and procedures to be mastered and recalled on demand. If tests call for the
students to engage in active exploration and reflection, to pose new questions and
solve new problems, the message can be that science is a mode of disciplined in-
quiry, applied specialized knowledge, investigative procedures, and rules of
evidence for understanding both the natural world and the technologies through
which humans have shaped that world to their ends. Even in elementary school, 
children can use classroom tests to help them understand what they should be 
learning. But during the middle years, with the growth of the capacity for abstract 
thought and especially for reflection about one's own learning, the messages 
students receive from classroom tests assume increasing importance. 

Classroom tests communicate not only the character of the teacher's intended 
learning outcomes, but also the level at which mastery is expected. If standards
and expectations are set too low (perhaps in a well-intentioned but misguided ef- 
fort to accommodate the special needs, diversity, rapid physical growth, or 
presumed cerebral dormancy of young adolescents), the students may infer that
they are not really expected to master difficult concepts. Low standards and expec-
tations might retard learning, and consequently the transition to high school science
will be needlessly difficult, or, worse, might never occur. Yet, if standards are set
too high, the effect can be to reinforce the unfortunate stereotype that science is too 
difficult for most students. 

Early adolescence is a time for exploration and experimentation, a time when 
students may test their interests and capabilities in a variety of content areas and 
form enduring impressions of different subject matters. Although career choices are 
rarely established until much later, the impressions early adolescents form, and the
decisions that they, their parents, teachers, and counselors make about tracking and 
courses to take in high school, profoundly affect their options for postsecondary 
education and their future vocations. Middle-level students' understanding of their 
mathematical and scientific abilities, and their decisions about courses they take 
in these areas, are far too important to be left to chance. In particular, the middle- 
level science program should acquaint youths with the wonder and excitement of 
formal science, and aid them in reaching an honest, but optimistic, assessment of 
their own capabilities to profit from future scientific study in high school. Sound 
classroom testing practices, including fair and consistent standards and expecta- 
tions, can further these ends. 

If science and technology education successfully address these goals, they will 
foster three types of outcomes: increased factual and conceptual knowledge; increas-
ed laboratory, thinking, and social skills; and increased disposition to apply one's
knowledge and skills to unfamiliar situations. Increased learning in these three areas 
is a prerequisite both for scientific literacy and for preparing to enter scientific or 
technical careers. The assessment challenge is how to adequately probe the students' 
competencies in all three of these areas and how to avoid certain adverse effects 
of testing. 

Science Knowledge 

The knowledge category includes the "what" of science and technology:
knowing facts about the natural and man-made worlds, for example, understand-
ing that sounds are patterns of motion and that the sounds of instruments or one's
voice vary as vibrations vary; knowing that rivers are part of the water cycle and 
knowing how their power is translated into electric energy; and understanding 
the functions of primate groupings and social interactions. Also included in the 
knowledge category are the concepts, principles, laws, and theories that scientists 
use to explain, for example, how vibrating strings relate to sound, how heat 
energy from the sun drives the hydrologic cycle, and the role of communication 
among primates. Beyond facts about the natural world, the theoretical knowledge 
used to compose explanations for these facts, and an understanding of how factual 
and conceptual knowledge are applied appropriately, the science knowledge 
category also, as noted in the discussion of goals, includes knowledge about the
scientific and technologic enterprises: their history, methods, philosophy, and 
values and their influence on human existence.

Assessing science knowledge. The first task in assessing the science knowledge
acquired by students is deciding which categories of that knowledge are to be
probed, and what knowledge within each category should be represented on a test.
Once these decisions have been made, testing of factual and theoretical knowledge
and knowledge about the scientific and technologic enterprises can be carried out
with relative ease, using paper and pencil. Often, short-answer or multiple-choice
items are used. This type of assessment format allows administration of
the test in group settings; hence, the exercises making up the assessment can be
given to a large number of individuals. Because of the relative ease and efficiency
of paper-and-pencil tests, particularly those (like multiple-choice) that can be
scored by machines, most tests intended for monitoring purposes, that is, providing
national, state, or district-wide information on student achievement, take this for-
mat (for example, state-mandated tests, commercially available standardized tests,
and tests used by the National Assessment of Educational Progress and in inter-
national comparisons). Unfortunately, all too often, multiple-choice items test recall
of unconnected bits of information, thereby conveying a distorted message about
the nature of science. Assessments, however, need not be limited to this form of
test. Teachers, in particular, have other strategies available to them. They can design
essay questions and review written and oral reports. They can also use more infor-
mal methods for gauging their students' science knowledge and embed assessment
of what knowledge has been learned in more holistic assessment strategies, as
described in later chapters.

Interpreting tests of science knowledge. Tests intended to assess science 
knowledge have a second important characteristic. For well-constructed exercises, 
the responses can be interpreted with reasonable certainty. A correct response in- 
dicates that the individual either knew the information required for the answer, or 
was able to figure it out using information provided as part of the question. (Of 
course, it could also be a lucky guess.) Determining the correctness of the response 
does not need to take into account the thinking skills the individual might have 
applied in comprehending the written item, in retrieving the fact from memory, in 
reasoning from the information in the item to the correct answer, or in eliminating
incorrect responses. In other words, the concern is not with the means individuals
may have used to access the information or the reasons for their conclusions, but
only with whether or not they have presented the correct information. Hence,
responses to factual items are relatively straightforward to interpret, whereas
interpretation becomes increasingly more difficult for items intended to test skills and
dispositions.



Skills. 

Meeting goals beyond knowledge acquisition entails developing four interrelated 
types of skills: practical laboratory skills, scientific intellectual skills, generic
(formal and practical) thinking skills, and social skills.

Assessing practical laboratory skills. Skills development for the middle-level
students should build upon their experiences in the elementary years. For
example, in their early years, children learn to measure mass by using a double-pan
balance to compare different objects with uniform masses, such as paper clips,
thumbtacks, weights. At the middle level, they should be able to move on and
discuss the accuracy of their measurements and possible sources of error. By the
time the students reach high school, they should understand the mechanics of
measuring mass (and the connected uncertainties) well enough that they can use
a digital balance, and thereby save weighing time and focus instead on interpreting
the usefulness or significance of the data.

Middle-level students should be able to measure length, volume, mass, time, and
temperature, using instruments capably and quantitative data comfortably. They
should also be able to use a microcomputer independently to enter, store, and 
retrieve data and to simulate experimental conditions. 

Assessing laboratory and computer skills requires laboratory equipment, 
materials, and computers. This sort of assessment distinguishes between knowing 
about how to do something, which can be probed with paper-and-pencil tests, and 
having the competence to do something, which cannot. To assess the latter,
assessment techniques need to closely match the ability to carry out a given scientific
procedure or design task. Obviously, this type of assessment is more difficult to
administer and score and requires more material resources than do paper-and-pencil
assessments. Nevertheless, NAEP conducted a successful pilot study in this area
(National Assessment of Educational Progress, 1987), and Connecticut, New York,
and California also are now experimenting with incorporating performance tasks
in their science assessments. In science classrooms that include science activities 
as a regular part of instruction, teachers have many opportunities to observe these 
skills in action, with the added benefit of being able to do corrective teaching as 
deficiencies manifest themselves. 

At the middle level, observing, classifying, measuring, and other laboratory skills 
useful for gathering information will recede from prominence, being no longer ends 
in themselves. This eases the assessment problem to some extent, as the students
will be able to record in a journal or notebook observations and data that can be
easily monitored by a teacher. The importance of these skills
can be made clear to students by presenting challenging and meaningful problems
whose solutions depend, at least in part, on the accuracy of measurements made 
over time and the careful recording of changes in experimental conditions. Examples 
of relevant activities are given in chapter 5. 

Assessing scientific intellectual skills. These skills include the ability to 
generate a hypothesis; to design an experiment that is a valid test of a hypothesis; 
and to collect, reduce, present, interpret, and analyze data (Frederiksen and Ward, 
1978). Skills related to technology include the design and building of artifacts
intended to perform a specified function. The combination of intellectual skills
relevant to science and technology also includes procedural knowledge: knowing
"how" to apply the "what," or the factual and conceptual knowledge and laboratory
skills one has acquired. Procedural knowledge is the key to addressing unfamiliar
scientific questions or operational problems that may arise in the course of one's
work, for students as well as for working scientists, engineers, or technicians.

The developing ability of early adolescents to reason deductively, to remove
themselves a bit from their experience, and to see how things might look to another
observer should enable them to be more flexible and open minded as they examine
data. A middle-level student can be expected to understand readily that the failure
to be able to report a result must be explained in some way. Middle-level students
should also understand that they must present the data as they observe them, not
as they think the data ought to be. The students should continue their practice from
elementary school of estimating and using the words greater, less than, the same
as, and they should now use their estimates to question whether measurements
or calculations are accurate and reliable. They should be developing enough
self-confidence to report what they actually see and to understand the role that
honesty plays in the scientific enterprise. By the end of their middle-level education, they
should be able to criticize their own work and monitor their own thinking.

Assessing the intellectual skills of science (hypothesis generation, experimental
design, data collection, data analysis, and data interpretation) introduces
additional confounding factors. Scientific intellectual skills integrate a complex variety
of generic thinking skills with the ability to select and perform appropriate
practical laboratory skills. The following example starts out with a measurement
problem, but quickly expands to a potential test of scientific thinking skills.

In most tests and assessment exercises, scientific intellectual skills are assumed 
to be generic skills that the student should be able to use in any scientific context. 
However, many testing experts disagree with this assumption, and they argue that 
familiarity with the context of the assessment exercise and the science knowledge 
available to the student are more important factors in the ability to perform an ex- 
ercise than the scientific intellectual skills. It is certainly conceivable that a student 
could succeed by using either science and context knowledge or scientific intellec- 
tual skills. This makes interpreting a student's performance quite difficult, particular- 
ly if the test is externally designed and scored. 

Assessing generic thinking skills. Included in this category are problem- 
solving skills and quantitative, logical, and analogical reasoning. These are com- 
ponent skills of scientific intellectual skills as well as intellectual skills associated 
with other disciplines (Nickerson, 1988). The problems of designing an assessment
exercise and interpreting a student's performance that we described previously
severely affect the assessment of generic thinking skills. The difficulty lies in
interpreting the behavior an assessment exercise elicits, and, again, this interpretation
is especially difficult when the assessment is out of context.

When a student performs an exercise and gives an answer, one has no way of 
knowing the mental processes and knowledge the student used to arrive at the 
answer. For example, if a student is given a description of a physical event and asked 
to explain it, the student's correct explanation may be the result of simply being
familiar with the situation and knowing the explanation for it. Alternatively, the
student might be unfamiliar with the situation, yet be able to recognize that a particular
scientific principle applies to the situation. The student can then apply the
principle with the appropriate reasoning skills and come to a correct answer. Another
possibility is that the student uses incorrect information when developing an
explanation, but uses a correct scientific principle and rules of logic while coming to
an incorrect answer. On the basis of the answer alone, the examiner cannot possibly
know whether the performance represents recall; logical application of correct
factual information, a scientific principle, and rules of logic; or right thinking with wrong
information.

Assessing social skills. Humans are generally social beings, and young
adolescents are particularly so. Consequently, the middle years of education are the
best time to complement the students' growing facility in general thinking, science
thinking, and laboratory work with skills for working effectively in groups. The need
for developing social skills in science grows not only out of an interest in
improving the students' social skills per se (although that is an important goal) but also
out of the close connections among social skills, learning, and science. Social skills,
such as listening carefully and respectfully, exchanging ideas and information,
welcoming a diverse array of approaches to solving problems, and acknowledging
that a variety of "right" answers (or reasonable interpretations) are possible, are some
of the skills the students require. Such skills enable the students to grapple
actively and productively with complex knowledge and ambiguous problems. Given a
problem or task that is within their capability to solve, students who are working
together can be expected to take on challenges that require perseverance and
commitment. Moreover, when young adolescents employ their developing skills in
science learning and do so in working groups, the classroom becomes a
replication of a community of science scholars pursuing scientific knowledge as a social
activity. Thus, the students begin to learn about the culture of science and to learn
skills valued in the workplace, where the application of science usually proceeds
through teams working together.

Assessing social skills is difficult. Written communication, such as laboratory
reports, reports on the design and construction of artifacts, or essays on a particular
scientific or technologic development, can be evaluated both for their scientific
content and the quality and appropriateness of language use. But most communication
skills involve direct interaction with other persons, and these skills are best
observed during group work. Time for such observation can be short and interpretation
difficult, particularly for science teachers in schools with departmentalized
structures who may see as many as 125 students in the course of a school day. At the
classroom level, spot diagnosis of problems related to social skills (or lack thereof)
during normal monitoring of the classroom process may suffice. With respect to
large-scale assessments, the need for highly trained observers would make valid
and reliable assessments of social skills expensive and feasible, perhaps, only for
small subsamples.

Assessing dispositions and habits of mind. Acquiring a knowledge base in
science and developing the skills to apply the relevant knowledge to academic
problems in school are necessary, but not sufficient in themselves. Unless science
education inclines one to apply scientific knowledge and skills to new situations in one's
work, daily life, and when one makes personal and social decisions, neither the goal
of developing productive science professionals nor the goal of scientific literacy for
all citizens will have been achieved.

Science education in the middle years should continue to address dispositions
and habits of mind as much as development of content and skills. When planning
curriculum, instruction, and assessments, one must take into account the
assumptions and attitudes students have about the nature of science and technology and
about themselves in relation to science and technology.

Goals for developing scientific habits of mind or scientific attitudes do not change 
at the middle level per se, but the emphasis should take advantage of the interests, 
needs, and strengths of students as they move through adolescence. Some of the 
most important scientific attitudes and dispositions that students should come to 
understand and practice are 

• Desiring knowledge. The curiosity and desire to know and understand the 
world should have been nurtured in the elementary years and should be sus- 
tained. The questions a student is asked should increase in complexity. For ex- 
ample, whereas five- to ten-year-old students find physical and chemical 
changes interesting in and of themselves, a teacher at the middle level might 
have to plan a discrepant event, such as boiling water in a flask with ice cubes 
around it, to stimulate questioning about the way the world works.

• Being skeptical. Getting students to question authoritarian statements and
increasing their confidence in independent thinking should be further 
developed at the middle level. Because students in these years will become in- 
creasingly able to understand another point of view, tradeoffs, risks, and 
benefits; and because they will be increasingly able to take responsibility for 
their own health and safety, this is an important attitude to cultivate and should 
greatly interest young adolescents. 

• Relying on data and relying on reason. Development during the middle
years should enable the students to become increasingly able to collect and 
organize data and to use data to test ideas. Adolescents' ability to reason, to stand 
back and take another perspective, are strengths that will help them develop 
this habit of mind. 

• Accepting ambiguity. Although students often hope for a "right" answer or
clear solution or outcome to experiments or problems, in practice data in scientific
and technologic problem solving are often ambiguous. The notion that conclusions
in science are tentative is a habit of mind that ought to be more clearly
understood by students as they move through middle-level science education.

These habits of mind (desiring knowledge, being skeptical, accepting ambiguity,
and relying on sound information and reasoning with it) characterize the
disposition toward continued engagement with science.

Making judgments about the extent to which students have acquired the habits 
of mind that dispose them to apply scientific knowledge and skills outside the
formal classroom setting adds yet another level of complexity to assessment. One might
attempt to assess disposition by the use of a self-report, that is, describing situations
and asking individuals to indicate whether or not they would take a "scientific"
approach to analyze them. This method has not yielded particularly trustworthy
information (Gardner, 1975; Munby, 1983; Murnane and Raizen, 1988). A more
appropriate method is to observe individuals and determine whether they use a
scientific approach to personal and civic problems. This method requires extensive
resources and, even when attempted, the direct observations that result are difficult 
to interpret. Does failing to take a scientific approach indicate that the person has 
the inclination but not the requisite skills? The requisite skills but not the inclina- 
tion? Neither? In addition, context has a profound influence on behavior. For ex- 
ample, not being scientific in approach in one situation might be an indication that 
either the skills or the inclination are not in place. An alternative interpretation is 
that the person did not deem the scientific approach or the solution suggested by
that approach appropriate for that particular situation, but would demonstrate a 
scientific inclination in other situations. 

A possible way around these dilemmas is to measure observable behaviors, for 
example, the students' interest in voluntarily undertaking science activities beyond 
prescribed classroom work (and subsequent enrollment in science electives), the 
students' self-monitoring of their work and their monitoring of peers. Teachers might 
add observations on these behaviors to the records they keep on their students. Con- 
ceivably, some structured performance tasks might also provide opportunity for 
observing these behaviors, particularly if the tasks call for sustained work. At this 
stage of understanding, however, much more research is needed in this area to
identify behaviors that are reliable indicators of future willingness to continue an
engagement with science and continue to apply one's science knowledge and skills.

Many assessments have included measures of attitudes about science as a way 
of getting at present and future scientific dispositions and habits of mind. These 
assessments ask whether the students like science lessons or their science teachers, 
whether they value the contributions of science to society, and whether they have 
plans for science careers (Hueftle et al., 1983; Mullis and Jenkins, 1988). These sorts
of attitude measures have two kinds of shortcomings: results are often paradoxical
(for example, "I like my science teacher" but, from the same student, "science class
is boring") and difficult to make sense of (Munby, 1983). Further, the linkages
between attitudes about science (even if they could be better assessed) and
achievement, let alone later dispositions to engage with and use science knowledge
and skills, are open to question (Willson, 1983). Other proxy variables that
researchers have used to assess scientific dispositions and habits of mind include
impulsivity, attitude toward one's own competence, and fair-mindedness (Nickerson,
1988; Rowe, 1979). Further work will have to be done before the proxy variables
can be linked with any confidence to scientific dispositions and habits of mind,
including the disposition to apply science knowledge and skills.

The Effects of Age. 

Age and its correlate, level of cognitive development, are another confounding
factor in science assessment. What is a problem-solving exercise for a
ten-year-old might well be recall of information for the thirteen-year-old. Also, the
thirteen-year-old will be able to bring a greater wealth of experience to the
exercise. Moreover, the relevant experiences available to one youngster might be very
different from those available to another who grows up in a different environment.
For example, there is evidence that girls, even at an early age, have different
exposure than do boys to certain experiences (fixing simple electrical or
mechanical things, playing with motor-driven toys, building tree houses, using
scientific equipment) relevant to solving some science problems (Mullis and
Jenkins, 1988). As age is easily established, it can be factored into interpretations
of assessment of performance, but the role of experience is difficult to take into
account unless an assessment specifically collects relevant background
information, as does NAEP (Hueftle et al., 1983; Mullis and Jenkins, 1988).

Learning over Time

The problems inherent in assessing complex learning outcomes can be analyzed
in a more general fashion. In an article in the New Directions in Measurement
series several years ago, Snow (1980) described a "continuum of referent generality"
in both aptitude and achievement measurement. Referent generality refers
roughly to the range of situations to which a given aptitude or achievement
pertains. At the highest level, there might be aptitudes like general mental ability
("IQ") or the kind of broad, complex developed achievement measured by the
SAT. At the lowest level, there might be aptitudes like "speed of response time" or
achievements like "two-column subtraction with borrowing." Important science
learning outcomes are likely to be higher in referent generality than narrower
learning outcomes. Examples are students' understandings of scientific method
or of such higher-level knowledge as the relationships between structure and
function, the meaning of scale, or the concept of systems.

Outcomes higher in referent generality are harder to teach directly, because 
they must be visited time and again, in a range of contexts, using different 
materials and different illustrations. They are harder to assess because they are 
less closely tied to any particular learning activity. The problem is how to assess 
understanding of the broad organizing principles, the inquiry approaches, and the 
ways of knowing that characterize science in the context of a particular learning 
unit, given that these understandings may take years to develop. The problem is 
not unique to science, nor is it solved in other content areas.

The Context of Science Education
in the Middle Years



[...] assessment, and good assessment is possible only in the context of good
science education. Therefore, before discussing strategies that can respond to the
assessment challenges posed by the [...] outlined in the previous chapter, we [...]
address these [...] outcomes. We begin by briefly outlining the broad context: a
widely held and, among many, a deeply [...] about the special needs of young
adolescents [...] accommodating these needs. We then characterize what actually
takes place for most students at the middle level: school organizational
arrangements, science curricula, science teachers and their working [...] we [...] the
students' [...] to [...] and perhaps [...] attitudes toward and learning of science. We
conclude by [...] some of the implications for assessing student outcomes and key
features of science

What Is Different about

Middle-level Education? 

Middle-level schools (junior high, intermediate, and middle schools) are
potentially society's most powerful force to recapture millions of youth adrift
and to help every young person thrive during early adolescence. Yet all too
often these schools exacerbate the problems of young adolescents. A volatile
mismatch exists between the organization and curriculum of middle-level
schools and the intellectual and emotional needs of young adolescents. Caught
in a vortex of changing demands, many youths' engagement in learning
diminishes, and their rates of alienation, substance abuse, absenteeism, and
dropping out of school begin to rise.

[...] the Carnegie Task Force on Education of Young Adolescents in
its report, Turning Points: Preparing American Youth for the 21st Century, [...]
middle-level education as unusually strong, even startling. Yet Carnegie's current claim
might be little more than the most recent expression of an [...]
acknowledge the unique and challenging needs of ten- to fourteen-year-olds, and
the general failure of schools to meet them.

Beliefs about Middle-level 

Students and What They Need



Long before there was an understanding of the developing cognitive
competencies of students in the middle years (discussed in chapter 2), psychologists and
educators called attention to the significant and often tumultuous physical, social,
and emotional changes in young adolescents. They also promoted the notion that
schools should better accommodate the special needs of this age group. They
were not without influence. As early as 1909, changes in school organization that
separated grades seven, eight, and nine from the later high school years stemmed
largely from G. Stanley Hall's (1905) work on the psychology of adolescence. And
interest in the special characteristics of young adolescents and what these
characteristics imply for school programs has continued throughout the century.
A decade ago, for example, the National Society for the Study of Education
published Toward Adolescence: The Middle School Years. A
theme running throughout its chapters is that students in the middle school years
are different: they have a high degree of intellectual curiosity; a wide range of
skills, interests and abilities; and they prefer active involvement to passive
learning activities. Not only were young adolescents found to be different from their
older and younger schoolmates, they also were observed to differ considerably
from one another in physical, social, emotional, and cognitive development
(Maurice, 1980).

Two syntheses of writings on the special mission and functioning of junior high
schools, one in 1940 and the other in 1970, show that throughout the past fifty
years, educators have been remarkably consistent in the educational implications
they draw from these characteristics of students (Gruhn and Douglas, 1971). In
1987, Hurd summarized the most common recommendations for schools
designed to serve young adolescents:

1. Integration. Learning experiences should be integrated "into effective and 
wholesome pupil behavior" as well as link the subjects in the curriculum. 

2. Exploration. Schools should lead students to discover and explore their 
own interests, abilities and skills and provide opportunities to include 
"cultural, social, civic, avocational and recreational interests" as a basis for 
vocational decisions. 

3. Guidance. Assistance should be provided to enable students to make
intelligent educational and vocational decisions and wholesome social and
personal choices.

4. Differentiation. Opportunities should be provided that accommodate
students of different backgrounds, interests, abilities, and needs.

5. Socialization. Learning experiences should be included that will enable
students as citizens to participate in and contribute to this country's democratic
society.

6. Articulation. Schools should provide students with help in acquiring the
backgrounds and skills that will help them succeed in later education and adult
life.

Recent changes in the social milieu of adolescents have not altered these
long-standing beliefs about what schools should provide students at the middle level.
Several recent reports have called for programs that focus on the physical, social,
and emotional needs of adolescents as well as on their academic learning. These
reports also recommend programs that provide integrated curricula, exploratory
experiences, and opportunities for close relationships with adults and peers (for
example, see the National Middle School Association's report, This We Believe [1982];
the report by the National Association of Secondary School Principals, An Agenda
for Excellence at the Middle School Level [1985]; and the report of the
Superintendent's Middle Grade Task Force, Caught in the Middle: Educational Reform for Young
Adolescents in California Public Schools [1987]). The one prominent recent
addition to this long-standing agenda is the goal of helping students learn how to learn
and think. Most likely, this new program goal follows from more recent
understandings of adolescents' cognitive development.

While the perceived importance of academic preparation (versus meeting the 
students' developmental needs) has waxed and waned throughout the century, most 
advocates for ten- to fourteen-year-olds seem to want both, and most believe that 
both are possible with an integrated, exploratory, and flexible middle-level program. 
For example, the National Middle School Association (1982:10) asserts that 

The curriculum must carefully balance academic goals and other human 
development needs. A middle school cannot succeed in fulfilling its educa- 
tional responsibilities if it ignores non-cognitive objectives. Indeed, it cannot 
succeed in fulfilling its cognitive objectives if it does not recognize the inter- 
related affective goals. 

To achieve this balance, the association recommends that schools provide a range
of organizational arrangements, for example, block scheduling, multi-age
grouping, and alternative schedules; varied instructional strategies with an emphasis on
small-group methods, peer interaction, independence, and experimentation; a full
exploratory program, for example, high-interest, short-term lessons and units,
controlled student choice, mini-courses, special-interest activities, independent study
projects; comprehensive peer and adult counseling; consideration for the wide
variation in the progress that students make; evaluation that emphasizes individual
uniqueness; interdisciplinary curriculum planning teams; and a family-like school
atmosphere.

Similarly, the recent Carnegie Task Force on Education of Young Adolescents
(1989) calls for specific changes in the organization of middle schools and their
curricula, teaching force, and relationships with families and communities. The task
force intends for these changes to simultaneously promote intellectual and personal
growth. Among its recommendations, the task force asks schools for young
adolescents to "reorganize into small communities or 'houses' that foster close
relationships between students and teachers; teach a core academic program that results
in literacy (including scientific literacy), critical thinking skills, healthy living skills,
ethical behavior, and responsible citizenship; ensure success for all students by
creating heterogeneous, cooperative, flexible, and resource-rich learning
environments; improve academic performance by promoting health and fitness, and
providing access to health care and counseling."

Noticeably absent from any of these writings on education for young adolescents
and what they need from school is the more traditional approach of teaching the
disciplines as disciplines. In fact, the concept of the middle-level grades as a "junior"
high school characterized by the organization, curricula, instructional strategies,
and psychosocial environments of senior high schools is seen as antithetical to the
developmental and intellectual needs of students of this age. The fact that most
schools serving students in their middle years follow this traditional pattern (as we
will report in more detail below) has provided the impetus for the middle school
movement.

Science for Students in the Middle Years 

What do these beliefs and recommendations for middle-level education imply for 
science education? In the 1960s and 1970s, reforms in science education for
middle-level education focused on the need to produce more scientists. The
primary goal of most of the new discipline-based curricula developed during this
period was to prepare students for further study of science (Hurd, 1987; Weiss,
1986). The few attempts made to create interdisciplinary curricula for the middle
level came late in this reform period and proved less acceptable to the schools
than the earlier discipline-based curricula. In recent years, however, a number of
science educators and researchers have drawn on the more general middle school
literature to recommend new directions for science programs for young
adolescents.

For example, Yager (1988: 1 2) has suggested that middle-level science programs 
can better accommodate the needs and interests of young adolescents by focusing
science lessons on the students themselves, "what they bring to the study, what
they can do, and evidence of growth in various domains." He has recommended
that program goals be oriented toward the student and that curricula and teaching
strategies be based on the past, current, and future experiences of students.
Similarly, the criteria developed by the National Science Teachers Association
(Yager, 1988) for selecting exemplary middle-level science programs include

• emphasis on learning how to learn; 

• learning science in real-life settings that are interdisciplinary, related to society, 
and related to daily living; 

• learning the independent use of inquiry to identify and solve problems; 

• learning decision-making strategies; 

• developing positive attitudes about science, and learning about current problems
that make clear the interdependence of science and technology and their
relationships to other human enterprises.

Other science educators, following the work of Flavell (1963), Case (1985), and
other developmental psychologists who have built on Piaget's work, have suggested 
that science classrooms for young adolescents should be dominated by discussion, 
opportunities for variation in the nature and pace of learning activities, resource 
centers, and direct interaction with objects and events. They propose that the cur- 
ricular emphasis be on linking concrete science experiences with events and 
phenomena familiar to students (Blosser, 1988; ERIC Clearinghouse, 1982). 

More specifically, Rakow (1988:1), following Rowe's (1978) finding that hands-on
experiences can positively affect students' sense of control, suggests that hands-on 
science investigations mesh with the need for young adolescents to "become the 
authority rather than the teacher or the textbook." This happens, he claims, when 
students gather data and solve problems. Rakow also suggests that understanding 
the tentative nature of science can help students in the middle grades become more
comfortable with alternative hypotheses and multiple solutions. Science education, 
especially if linked with technology education, also affords young adolescents an 
opportunity to study real-world problems and allows them to explore and debate 
issues of immediate concern to them— health, the environment, and energy. Such 
study, Rakow claims, can help shape positive attitudes toward science. 

In specific reference to science education, the Carnegie Task Force on Education 
of Young Adolescents (1989) suggests that health education be integrated into the 
curriculum as an element of the life sciences. Pointing to the Human Biology Pro- 
gram at Stanford University as exemplary, the task force claims that if students learn 
about health in the context of science, they will better understand how their bodies 
grow and function. As a result of this life-science focus, the task force claims, young 
people can come to appreciate the value of a healthy diet and exercise and recognize 
the dangers of illicit drugs, alcohol, and tobacco. However, the report cautions that, 
to be effective, such a curriculum must also train young adolescents in skills that 
will enable them to resist pressures to engage in negative health-related behaviors. 

Beliefs about what types of science programs will be best suited to students in 
the middle grades are widely shared by science curriculum developers and middle
school advocates. However, as Hurd (1987:29) notes, most recommendations
to match science programs to the developmental needs of adolescents have been
"more rhetorical than conceptual" and, as such, have provided little specific
guidance for curriculum development. He notes that the "literature on the middle
school movement is rich in perspectives of what a curriculum should accomplish,
but not the curriculum itself" (p. 25). Moreover, Hurd reminds us that, while the
literature offers a great deal of hope about better programs for young adolescents,
no solid data establish the effectiveness of either the [middle school concept]
or of science programs that match this concept. He concludes "the entire issue of
science education for the early adolescent remains unresolved. There is no end of
statements and committee [...] leadership to focus and sustain the essential actions
for reform" (p. 43).

Somewhat different from, though not necessarily inconsistent with, the perspective
of meeting the developmental needs of young adolescents is the perspective
of science educators developing curricula based on cognitive science research. For
example, Anderson and Roth (198&14) use two criteria to define the nature of
scientific understanding: (1) developing "knowledge that is useful for the essential
functions of describing, explaining, predicting, and controlling the world around us . . . [and
(2) developing] knowledge that is conceptually coherent and integrated with [one's]
personal knowledge of the world." For many fundamental science concepts, this
entails that students go through a complex process of conceptual change. In the case
of photosynthesis explored by Anderson and Roth, students must reconceive their
common-sense notions of food derived from their own experiences and
understandings about the different metabolism of plants, as contrasted to that of
humans and pets, with which they are familiar. Science classrooms generally fail
to bring about the change that makes it possible for students to absorb and understand
a new concept and reintegrate it into their understanding of the world. It is
for this reason that individuals often revert to "common sense" interpretations of
phenomena when they are met in non-school contexts, even though the canonical
scientific explanation was learned in school. To make possible conceptual change,
Anderson and Roth argue, requires attention on sense-making, that is, teaching for 
depth of understanding (narrowing and deepening the curriculum), rather than 
breadth of coverage (covering a wide range of content superficially). Of course, current
testing of science knowledge, particularly when carried out through widely administered
standardized tests, rewards precisely the latter teaching strategy:
memorizing as many facts and concepts as possible.

A second requirement for true science learning, according to Anderson and Roth, 
is flexibility and the use of an array of teaching strategies suited to the progress of 
individual students. Curricula and instruction must be engaging and accessible to 
students, yet challenge them, through discrepant information, to work hard and 
think. The teacher's role is not as an expert who provides the right answer, but as 
a model who enters into scientific inquiry and discussion and who coaches students 
to do so as well. Moreover, students need to be encouraged to explain and use their 
newly gained knowledge themselves, finding out in the process something of the 
nature of scientific dialogue, the application of scientific ideas, and the process of 
evaluating one's own and other people's work in science so that valid ideas and appropriate
solutions emerge. Unfortunately, the reality of schools and science instruction
is far from this ideal.

What Students Actually Experience



As Hurd (1987) points out, despite calls for middle schools and science programs 
that personalize knowledge, integrate subject areas, provide exploratory experiences,
build close personal relationships, and allow for student diversity, few
specific programs that make these goals concrete have been developed.
Moreover, as we describe below, the more frequently found school organization
arrangements, science curricula, science training of teachers, and working conditions
in schools all promote traditional, academically focused science experiences
for students at the middle level. Few students, it seems, have access to the kinds 
of programs middle school advocates and science educators recommend. 

Organizational Arrangements 

Grade Span. Students at the middle level attend many different types of schools. 
Epstein and Maciver (1989) at the Center for Research on Elementary and Middle 
Schools report that seventh graders may attend schools that include any of thirty 
different grade spans. The largest number of seventh graders attend grade six 
through eight schools (representing a 160-percent increase in the number of these
schools since 1970), although in many parts of the country grade seven through 
eight schools are the norm, and in others, grade seven through nine schools are 
common (Alexander and McEwin, 1989). Some surveys indicate that schools with 
grade spans including five through eight or six through eight more often exhibit 
key middle school characteristics (innovative scheduling, cross-disciplinary 
teams of teachers and students, supportive guidance practices) than do schools 
that begin with grade seven (Cawelti, 1988; Epstein and Maciver, 1989). However, 
middle school advocates note that it takes more than a change of grade span to 
create a middle school. And some claim that the marked recent shift in grade span 
is more likely to have been spurred by administrative convenience to alleviate the 
recent overcrowding of elementary schools than by growing interest in the middle 
school concept (Rothman and Cohen, 1989).

Departmentalization. One of the keys to distinguishing the middle school con- 
cept from the more traditional junior high school is the degree to which subject 
areas are housed into distinct departments where students take separate courses 
in each subject from specialist teachers. While many middle schools are moving 
toward interdisciplinary classrooms and block scheduling, distinct daily class 
periods of equal length for each subject remain the norm with, according to one 
recent survey, 66 percent of schools serving seventh through ninth graders using
this type of department-oriented scheduling (Cawelti, 1988). Eighty percent of
teachers assigned to teach seventh through ninth graders are members of a subject-area
department and teach classes in that subject to intact classes. Almost none
teach self-contained classrooms covering all subjects (Center for Research on
Elementary and Middle Schools, 1987). Only about 16 percent of schools are
organized into interdisciplinary teams, about half of which include math/science 
teams (Cawelti, 1988).

The extent of departmentalization, however, overestimates the percentage of
students who stay together for part or all of their school day. About equal percentages
of seventh through ninth graders stay together for all of their subjects, regroup
for one or two subjects, and regroup for all of their subjects (Center for Research on
Elementary and Middle Schools, 1987).

Grouping by Ability. The predominant method of grouping students for academic 
instruction in the middle grades is placing them into homogeneous groups by ability. 
In a recent science survey, fewer than a third of the teachers reported that their 
classes included students of widely varying abilities; the rest judged their classes
to comprise homogeneous groups of high-, average-, or low-ability students (Weiss,
1987). Because grouping by ability is often accompanied by differences in curriculum,
teaching, and classroom atmosphere, it can result in uneven science learning
opportunities for middle-level students. And importantly, grouping by ability
may place poor and minority students at a greater disadvantage, as they are found
disproportionately in low-ability science classes— those with the most limited 
science learning opportunities (Oakes, 1990). 

With departmentalization and grouping by ability, middle schools appear to mirror
rather rigid high school organizational practices. Only a small percentage of schools 
have developed schemes that might more easily accommodate the diversity among 
young adolescents and the spurts in growth that individual students may be expected 
to experience. 

The Science Curriculum 

Researchers have documented that the science curriculum for most students at the 
middle level focuses almost exclusively on academic preparation and largely ignores 
other middle school goals such as relating science to everyday life, pressing social 
issues, or the personal concerns of adolescents (Armstrong et al., 1986; Goodlad, 
1984; Hurd et al., 1981; Hurd, 1987; Johnston and Aldridge, 1984; Weiss, 1987). For
example, one study documented that middle school science teachers found such
goals to be "diffused, impractical, remote, and unrealistic," and that the most commonly
accepted reason for having young adolescents study science is that they
should acquire specific information on science topics (Hurd et al., 1981). 

It is not surprising, then, to find traditional content and instructional modes
dominating curriculum and instruction at the middle level. Rather than being exposed
to exploratory, integrated science, most students are taught a series of traditional
science topics. Rather than learning science in connection with other content
areas, most students at the middle level take a sequence of specialized courses
(life science, physical science, and earth science) or a series of "general" science
courses. Lecture, textbook reading, recitation, and tests most frequently characterize
science instruction (Goodlad, 1984; Hurd et al., 1981; Weiss, 1987). 

Students at the middle level typically use a single text as the source for lessons, 
activities, lectures, and reading assignments, and most texts are but watered-down
versions of those used in high school science. Although some demonstrations and
laboratory work supplement these dominant modes of science instruction, students 
have few opportunities for direct experiences and hands-on activities that engage 
them in doing science (Weiss, 1986). 

Perhaps as a result of a mismatch between the needs and interests of young 
adolescents and the science curriculum, many students appear to find science difficult,
boring, and irrelevant (Goodlad, 1984). The most recent science assessment
conducted by the National Assessment of Educational Progress (NAEP) (Mullis and 
Jenkins, 1988) found, for example, that only slightly more than half of the thirteen- 
year-olds said that they thought what they were learning in science was useful in 
everyday life or that they would use science in many ways as an adult. Even more 
discouraging, fewer than half the students questioned thought that the application 
of science could help solve such major social problems as world starvation (25
percent), birth defects (34 percent), and reduction of air and water pollution
(49 percent). Fewer than half (43 percent) thought that science would help them
earn a living or that science would be important in their life's work (37 percent).
Prior science assessments have found that nearly three-quarters of the thirteen-year- 
olds found their classes boring, and more than half reported that they did not like 
science and planned to quit taking it as soon as they were free to do so (Hueftle et 
al., 1983). 

While middle-level science programs may turn many students off, these programs
seem to affect girls more negatively than boys. Information from NAEP and other 
data consistently reveal gender differences in thirteen-year-olds' attitudes toward 
science (Hueftle et al., 1983; Mullis and Jenkins, 1988; Zimmerer and Bennett, 1987).

Teachers 

The recent report of the Carnegie Task Force on Education of Young Adolescents 
(1989) put it bluntly: "Many teachers of young adolescents today dislike their work. 
Assignment to a middle-grade school is, all too frequently, the last choice of teachers 
who are prepared for elementary and secondary education." If this is right, it may 
help explain why there are shortages of well-qualified science teachers at the mid- 
dle level and why, as states require increasing numbers of science courses for high 
school graduation, this shortage is likely to grow worse. 

In 1986, for example, only 68 percent of science teachers in grades seven through 
nine had taken the number of college courses recommended for middle-level 
science teachers by the National Science Teachers Association or had degrees in 
science or science education. Only 73 percent were certified by their states to teach 
one or more science subjects (Weiss, 1987), indicating that more than a quarter of 
the science teachers in these grades were teaching out of their field. Compounding 
the problem, most of those teachers who are qualified to teach science are likely 
to lack training in working with students in the middle grades (Padilla, 1986). The 
National Middle School Association notes that only twenty-one states offer special 
certification for middle school teachers (Alexander and McEwin, 1989). The status
and training of teachers of students at the middle level is treated more fully in the
Center's companion volume (Loucks-Horsley et al., 1990).

Teachers' Working Conditions 

The working conditions that teachers encounter in their schools exert perhaps as 
great an influence on middle-level science teaching as the teachers' background 
and training. Many teachers of young adolescents find that the departmental struc- 
ture of their schools and the large number of students that they work with each day 
make it difficult to accommodate the personal, social, and intellectual needs of young 
adolescents (Center for Research on Elementary and Middle Schools, 1987). Additionally,
few teachers in either self-contained or departmental schools have the time
and resources to explore exemplary programs and practices or to work with other 
teachers in designing programs and lessons that cross traditional disciplinary lines. 

Many middle-level teachers must work in more difficult physical environments 
than teachers of older or younger students. Many junior high school buildings are 
converted high schools, older buildings that might have few of the physical arrangements
conducive to integrated, cross-disciplinary programs (such as connecting,
clustered classrooms and space for small-group work). Science teachers face
particular constraints with school facilities and equipment that are inappropriate
for inquiry-based, exploratory science activities (Weiss, 1986). In a recent national
survey, about one-quarter of the teachers of seventh through ninth graders reported
that inadequate facilities, insufficient funds for purchasing equipment and supplies, 
and the lack of materials for individual instruction were serious problems at their 
schools (Weiss, 1987). 

Between Reality and the Vision: Some Obstacles

The gap between the vision of what middle-level science might be and the current 
reality is substantial, and it is a gap not easily bridged. Many obstacles stand in the 
way of altering the roles of teachers and the nature of the curriculum, instruction, 
and assessment they provide in classrooms. 

For example, the expectation that teachers can provide for the personal and 
social needs of adolescents presumes that teachers have the knowledge and skills 
to assist students with sensitive issues— issues traditionally in the purview of 
guidance counselors and public health workers. But most teachers have had no 
training in such matters. Nor do many of them relish the task of counseling 
students about such family problems as divorce, child abuse, drugs, teen sexuali- 
ty, and other personal and social matters that weigh heavily on many young 
adolescents. Moreover, the ideal of providing academic instruction and social sup- 
port through closer, less formal relationships in small communities of students 
will require a substantial shift in the school conditions under which teachers and 
students come together. Most teachers now are burdened with too many students,
too many bureaucratic demands, and too little time to take on these added responsibilities.
A number of schools that serve young adolescents are moving toward a
structure that separates students and teachers into small communities or "houses";
however, one cannot assume that close, productive relationships between the
students and their teachers will necessarily follow. Most likely, such relationships
will require a great deal of additional teacher education and substantial changes 
in school cultures and peer-group norms (of both students and teachers). 

Implications for Assessment of Student Learning

Nowhere are the gaps larger and the obstacles greater than in assessment. Current
practices of assessing science learning with "objective," paper-and-pencil instruments
focused on the mastery of basic science facts, using individual
assessments exclusively, placing students in individual competition for grades, and
measuring the quality of programs by aggregating students' test scores stand in stark
contrast to assessment strategies that serve the type of science instruction we and
others envision for young adolescents. Most paper-and-pencil assessments work
against a curriculum that engages students with rich and complex science ideas
and embeds these ideas in real-world problems. Individual, competitive assessments
make group work seem irrelevant and discourage cooperation. Measuring program 
quality by test scores alone does little to encourage schools to have the resources 
and structures in place that can enable programs to become richer and better suited 
to young adolescents. 

Altering the processes whereby students' progress is measured and programs are 
judged in ways that make assessment compatible with, and indeed serve, the vi- 
sion of an ideal science program for young adolescents will be a formidable task. 
Within classrooms, measures must be developed and used that tap into the students' 
understanding of large concepts and assess their facility in using these concepts to 
solve a variety of problems. This is difficult because it probably requires collecting
samples of the students' work, recording what the students are thinking as they go
about giving science explanations or solving problems, and having materials and
equipment available to use in assessment exercises. Assessments that serve the vision
of science instruction put forward in the Center's report will also require
measures of how well groups of students produce science knowledge cooperative- 
ly, in addition to measures of individual contributions to group work. 

Reporting the results of such assessments to parents presents problems, since 
parents are accustomed to having their children's work "summed up" in grades that 
show, not what the child has learned or how, but how well the child compares with 
others. Reporting the results of what we consider appropriate assessments upward
(to principals, districts, and states) is equally problematic. By its very nature, reporting
about large groups of students entails undesirable reductionism, whereas the
assessment strategies we suggest (discussed in greater detail in the next two chapters) 
militate against the type of information that can be easily reduced to chunks of in- 
formation to be aggregated across classrooms, schools, districts, and states. 






Another issue that needs concerted attention is to ensure that external 
assessments— state and national achievement assessments— correspond to the 
assessments we suggest for classrooms. Unless external assessments come to match 
the new strategies we suggest, it is unlikely these new strategies will take hold in
classrooms. The pressure to do well on external assessments, particularly when
these are used to reward or sanction individual schools or teachers, will continue
to drive the direction of classroom teaching, learning, and assessment.

Similar barriers stand in the way of more informative and useful assessments of 
program quality, as described in chapter 4 of the Center's report, Assessment in
Elementary School Science Education (Raizen et al., 1989). It is difficult to develop 
measures of most important "enabling" features of science programs— sufficient, 
high-quality resources; good teachers, curriculum, and instruction; and professional 
conditions for science teaching. Not only are such measures difficult to develop; 
it is hard to bring them to the attention of the public and policy makers who have 
grown used to using test scores as the most important— and often the only— measure 
of program quality. Yet without information on program features, policy makers will
find themselves without much guidance on how to improve student outcomes. 

With all of these difficulties, why should anyone bother? Our vision of science 
at the middle level and its assessment has two important payoffs. First, and by far 
more important, there is the opportunity to help adolescents become critical 
thinkers, in science and in general, a major goal of education. Second, there is the 
potential to use science instruction to overcome the traditional mismatch between 
conventional schooling and the needs of young adolescents. To give some reality
to this potential will require curricula and instruction, and teachers and teaching 
conditions that speak to the possibilities presented by the growing capabilities and 
interests of early adolescents as described in chapter 2 and the goals of science 
education delineated in chapter 3. It will also require assessment strategies that are 
consonant with and support the science education envisioned in this and the 
Center's other two reports on middle-level education. (See Bybee et al., 1990 and
Loucks-Horsley et al., 1990).

In the next two chapters, we discuss such assessment strategies at greater length.

Chapter V

Assessment in Middle-Level Science:
Improving Current Practice 



The scenario on the next page illustrates some of the characteristics that science 
teaching and assessment can foster at the middle level: the emerging ability of 
students to evaluate what they know and do not know, their expanding communications
skills, their interest in collaborating with peers, and their capacity for
understanding specialized information (Superintendent's Middle Grade Task Force,
1987). In this chapter, we consider in greater detail the implications for assessment
of the developing potential of young adolescents and the goals of science educa- 
tion. Apparently, the needs and capacities of students and teachers at the middle 
level directly match the opportunities presented by classroom environments that 
integrate active science learning and performance-based assessment (Carnegie Task
Force on Education of Young Adolescents, 1989). 

As noted in chapter 2, from a developmental perspective, middle-level students 
are working toward engaging in sophisticated types of thinking and reasoning. They 
have a growing capacity to understand the natural world, which is crucial to science 
learning, and because of their increasing realization of connections between self 
and the larger world, they are likely to be interested in applications of what they 
learn to real-world problems. From an assessment perspective, these adolescents 
can see themselves in place of others and interpret what others think about them, 
which is central to self-evaluation. Also, because the locus of control is shifting away 
from the adults around them to a desire for personal authority, middle-level students
can be given more responsibility for self-evaluation. During the years of early
adolescence, students are striving for increased independence, and this inclination
can be fostered in a constructive way by giving them more responsibility for directing
their own learning and monitoring their own progress. These students are ready
to benefit from a classroom environment that provides challenging contexts and
helps them in their search for understanding of the world around them. Unfortunately,
as the review in the preceding chapter shows, recent studies indicate that most
teaching, including science teaching, is instead numbingly dull and disconnected
from any meaningful context. 

Good science instruction that is also consistent with the inclinations of students 
aged ten through fourteen requires teachers to explore methods of guiding and 
evaluating their students, progress which is different from mere teaching and testing 
for rote learning. Hands-on activities that are so vital to understanding scientific con- 
cepts should be an integral part of these methods Students should be given the 
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£ arty m the month, Ms. Lopez* 
class had vsitedthekxalmuseum 
to ot&rve and dmwprinmte skeletons. 
She hod highlighted many skeletal 
features, pointing out that limbs were 
Ifce levers, and that the skuii, sternum, 
and pelvis protected delicate and im- 
port^ body parts. She had shoum her 
students how vertebrae protected many 
internal organs and supported the 
body, yet allowed for flexibility. Later 
in the month, they had visaed the zoo 
Ms. Lopez wanted her students to 
become aware of the ways primate 
and other animal bodies were alike, 
and how they aWeredfrim each other. 
Over the weeks, as Ms. Lopez and her 
students continued to investigate 
bones, they often discussed fitness, 
health, disease, and ways thai scien- 
tists answer questions by studying 
bones. 

After her class had visited the 
museum and zoo, Ms Lopez wanted 
to assess what her students had hom- 
ed. She set up twelve stations in the 
classroom, and put a bone at each 
one. At one station she placed a 
vertebra of a cow, at another, the rib of 
a mouse, at another, a plastic human 
arm bone. At each station she penciled 



a question or two on a piece of paper, 
which she left next to the bone 

Ms Lopez's entire dassroom was a 
resource cenfer for her students. Bright- 
ly colored tacks pinned sketches of 
bones the studerHs had drawn at the 
museum and posters of animal 
skeletons to the buUetin board On a 
table in front of the bulletin board, a 
plastic human skeleton, no more than 
two feet high, hung suspended from its 
mowTt.fe her studeris answered the 
questions, they often left their stations 
to look at the posters, drawings, and 
the skeleton. Once the students had 
finished, she collected their answers, 
which she would later review 

The next week, Ms Lopez decided 
to introduce an activity in which the 
student would study owl pellets. She 
would use the activity to culminate the 
students investigation of bones and to 
help assess her students' learning. Ms. 
Lopez preceded the adiudy with an ex- 
planation of how owls eat their prey, 
digest the soft body pans, and then 
regurgitate undigested pellets consisting 
of bones, skin, and feathers or fur. She 
challenged her students with a simple 
question: What do owls eat? 

Ms Lopez began the mfbrmation- 



opportunity to use ordinary tools and to make models using common materials,
such as wood, paper, plastic, and metal. They should be given experiences with
using tools and materials to solve problems or answer questions. These experiences
serve to increase the students' confidence that they can care for themselves, and
they provide practice for such problem-solving techniques as troubleshooting and
designing experiments. Attempts to fit young adolescents into the restrictive cycle
of lecture, repeat-after-me instruction, and formal testing lead to frustration on the
part of both teachers and students and fail to maximize learning at a critical time
in the students' academic career. 

Although such a shift in philosophy and methods will undoubtedly be difficult, 
middle-level teachers can also benefit from an activity-oriented approach to instruc-
gathering phase of the activity. She
distributed owl pellets, which she had
obtained from a school science supply
house, to groups of two students. The
students carefully pulled the pellets
apart, revealing tiny bones inside. Us-
ing an identification sheet, the students
sorted the bones, matched them, and
speculated as to which animal the
bones had come from. As the students
worked, Ms. Lopez moved from station
to station with her stack of index cards
and noted the way each pair of
students went about their work. She
looked for ways the students classified
the bones and evaluated how the
students worked together. She listened
as they discussed approaches to
answering the question she had set for
them.

Once the students had sorted and 
grouped the bones, Ms. Lopez asked 
the students to fasten them to a sheet 
of black paper, laid out so that the 
skeleton of each animal was 
reconstructed as accurately as possible 
in two dimensions. As the students 
worked, she gained insight into how 
the students interpreted what they 
found inside the pellets. She looked to 
see whether they identified patterns, 



noted the size and scale of what they 
had found, and transferred their 
knowledge of the skeletons of larger 
mammals to a new set of mammals.
She watched for when they ruled out
hunches refuted or not supported by 
evidence. 

When at last the students had finish-
ed, she asked each pair to decide
on their answer to her question: What
do owls eat? On one level, she expected
such answers as voles and field mice.
But, as each group showed their data
to the class, she began to look for more
complex scientific thinking. Were the
students demanding justification for in-
ferences? Had they questioned whether
the data were sufficient to make
generalizations? Did the students ask 
new questions? 

As her students discussed each 
presentation, she took notes. Later, she
added these records to the skeleton
charts and written summaries of the
owl's place in the food chain that the
students had completed earlier. She
also had their answers to the previous
week's questions on individual bones.
She now had a rich supply of data to
assess her teaching and her students'
learning.



tion and assessment. As we note in chapter 4, compared to elementary school
teachers, middle-level teachers often are responsible for teaching many classes of
students, instead of one or two, which makes it difficult for them to assess accurately
the learning of each student. They also have additional demands on their time, such
as guidance counseling and managing extra-curricular activities. Thus, an instruc-
tion and assessment model based on the learner, which gives students more choice
for their own educational success, can help middle-level teachers manage their time
more effectively. 

Middle-level teachers should capitalize on the growing independence of their 
students. This independence offers a chance for the teacher to divert energies from 
managing the classroom to facilitating learning. For example, whereas nearly one-
fourth of the elementary school teachers report having to spend more than three
hours a week maintaining order and discipline, virtually none of the middle-level
teachers report such demands on their time (Mullis and Jenkins, 1988); almost 80
percent of the middle-level teachers report spending less than one hour per week
on these activities. Middle-level students need less direct minute-by-minute guidance
about "what to do next" than do younger students, and their need for independence
will benefit from opportunities to direct their own learning.

In fact, among elementary, middle, and high school teachers, middle-level 
teachers have the greatest freedom to build effective evaluation strategies for their 
students: they are released from the constant supervisory activities so prevalent in 
elementary school, they are given increased scope in terms of curriculum and stu-
dent capabilities, and, for the most part, they are only beginning to feel the pressure
that grading can exert on the future of individual students. Although in recent years
grades have become a central issue for students and administrators, the evaluation
policies of middle-level science teachers still have relatively limited consequences
for students' careers or plans for higher education. Thus, the middle-level teacher
is, to a certain extent, free to experiment with a variety of participatory assessment 
strategies to find those that both foster the goals of science learning and meet the 
needs of students in early adolescence. 

Assessment in the Service of Instruction 

Why do teachers need to assess students? The primary reasons ought to include 
concerns for instruction rather than the formal reason of assigning grades. For ex- 
ample, teachers need to assess students' prior knowledge in order to know where 
to begin instruction and to monitor progress to see if certain concepts and skills have
been learned. Teachers need to assess what knowledge and skills need to be
"retaught" and where to go next. 

Reporting to others— parents, school administrators, and the community— about 
students' progress is also part of assessment, and these more formal activities deserve 
attention. However, the bulk of teachers' assessment activities ought to relate directly
to providing appropriate instruction and should be not separate from instruction
but integral to it. In fact, good instructional tasks and good assessment tasks should
be indistinguishable, as the example on pages 44-46 illustrates.

Construction rather than instruction. Although individualized instruction and 
evaluation are held out as an ideal, the pressures of class size and multiple
preparations may render them infeasible. Middle-level teachers frequently are faced
with the demand of teaching many more classes than teachers in the elementary
grades, and "getting to know" so many students is often difficult enough without
trying to tailor instruction and measure its effectiveness on a one-to-one basis. Thus,
efficiency in assessment must also be a priority for the middle-level teacher. The 
inclination of students in the middle school to strive for increased independence 
can be fostered constructively by giving them more responsibility for directing their 
Ms. Lopez's class has been studying
primates as the basis for learning
about communication and behavior.
Ms. Lopez has found that a trip to the
zoo to observe monkeys, two resident
gorillas, and two pairs of orangutans is a
good way to introduce thinking about
the advantages of group living and
social interactions from a somewhat
removed position. Her middle-school
students find primates interesting and
funny, so different from human beings
yet eerily familiar. This helps Ms. Lopez
engage her class in this animal study.
These students can distinguish behaviors
in other species that have parallels in
their own lives while beginning to
understand the perils of anthropomor-
phism. By learning about (perhaps)
less complex primates, they can begin to
develop a language and conceptual
base for reflection on their own relation-
ships and interactions. After the zoo visit,
the students have worked collectively to
interpret their observations. A list of
biological and social functions is begin-
ning to emerge from the observed
behaviors, which contains such notions
as communication, caring for young,
protecting themselves, social and fami-
ly groups, and getting food. In addition
to learning about primates, students
have been practicing their skills, in-
cluding observation, recording data, and
interpretation.

Ms. Lopez has emphasized six 
characteristics of primates that they 
share and that differentiate them from 
other groups of animals: grasping with 
fingers and/or toes, the opposable 
thumb, finger and toe tips for touching, 
great capacity for "thinking," stereo- 
scopic vision and seeing in color, and 
group living. Ms. Lopez wants to focus on
the last, namely group dynamics and the advan-
tages of group living: safety in numbers,



cooperation in obtaining food and car- 
ing for young, and establishing bonds 
and aid in the health of the group 
through such interactions as grooming. 

How can Ms. Lopez assess her stu-
dents' learning and understanding?

She wants her students to make use of
information-gathering and problem-
solving skills, as well as to exercise habits
of mind that characterize the doing of
science. She asks them to (1) select a
behavior or set of behaviors from their
journal observations that they found par-
ticularly interesting. Next they are to (2)
generate ideas about why these
behaviors might be useful to this type of
primate. They are to then (3) set the task
of researching information on the type of
primate observed, seeking evidence that
will help to (4) decide which of their own
ideas are supported and which are not.

Each group of students will (5) report 
to their classmates on the following 
topics: 

• This is the behavior we found 
interesting. 

• These are some ideas we have 
about the benefit of the behavior to 
the primate group.

• Here is what we learned from 
research to support our own 
thinking. 

• Here is what we learned that does 
not support our ideas. 

• This is a summary. 

In addition to learning about primates
and having her students develop critical
thinking skills, Ms. Lopez wants to rein-
force some of the larger concepts and
themes underlying her science instruc-
tion: diversity and variation; patterns,
rhythms, and cycles; and models and
theories. Thus, to explore wider ranges
of animal interaction and strategies for
survival, and to emphasize some of these
larger concepts, she proposes that the
class form small groups to study the
behavior of other animals and insects.
In this instance, Ms. Lopez asks the
students to choose their own groups, and
it doesn't take long before the groups are
formed and begin suggesting species to
study. Rosemary's group wants to learn
about dolphins and Joling's about
elephants. To ensure diversity, Ms. Lopez
suggests that the other two groups study
ducks and honeybees. Each group will
conduct research on existing informa-
tion, including visits to the library and in-
terviews with relatives and neighbors
who might have special information. Us-
ing their experience with primates as the
foundation of their study, they are ask-
ed to focus on how their chosen species
communicates, protects itself, cares for
its young, gets food, and organizes itself
socially and in family units. Their notes



will be assembled for eventual review by
Ms. Lopez, with the final goal an oral
presentation by one of the group
members to the entire class. The
presenter will be selected by a random
method. As an aid to the presentation,
each group is advised to prepare a poster
depicting the key points of its findings.
After the presentations, the class will
discuss the similarities and differences
among groups and make generaliza-
tions about the behaviors of living things.
Ms. Lopez's instruction is designed to
lead students to increase their science
knowledge and skills, but at the same
time she is systematically collecting
evidence of their learning through the
notes, the posters, the oral presentation,
and the quality of the group discussion.
Further, students are building their social
and communications skills and being
given responsibility for the success of
their own group.



own learning and monitoring their own progress. Teachers will use their time more
effectively if they share the burden of evaluation with their students and utilize the
students' developing abilities in the areas of peer- and self-evaluation.

Laboratory activities and group projects offer opportunities for efficiently
evaluating students' learning, while also affording students a chance to work in-
dependently and to collaborate with their peers. When the challenges are appro-
priate to the manual skills, available materials, and stage in development of critical
thinking skills, middle-level students are able to use tools and go through the pro-
cess of solving technological problems. Furthermore, students of this age are apt
to find such active learning of interest, and problems of technology often are effec- 
tive vehicles for developing critical thinking skills. 

Students should be encouraged to ask questions, and teachers can use a variety 
of follow-up procedures to foster such inquiry. As illustrated in the example at the
opening of this chapter, students can be asked to work in groups to answer each
other's questions, or the teacher can periodically place interesting discussion ques-
tions in a log. When students display interest in a particular topic, the teacher can
ask them to work in this area outside of school and to present their findings to the 
class. 

Teachers can assign more structured tasks and long-term projects in which
students are responsible for various phases along the way, such as collecting and
interpreting data, reporting their findings, and formulating a researchable question.
Rather than evaluating the results of each phase independently, teachers can ask
their students to proceed on their own, only seeking help when they need it. Alter-
natively, the students can share their progress periodically with classmates to receive
suggestions about areas in need of improvement or more in-depth treatment. 

Because it is difficult to monitor the daily progress of thirty to forty students work-
ing on laboratory activities or projects, a teacher can ask the students to keep notes
on their investigations, and detail their procedures, findings, and interpretations.
These accounts need not adhere to the formal style of laboratory notes, but should
go beyond simple narrative and take on some of the characteristics of scien-
tific reporting.

These notes and reports of scientific investigations can be assembled into port- 
folios and used in teacher-student conferences to discuss the student's growth in 
science knowledge, in understanding the principles of scientific inquiry, and in the 
ways in which these can be applied. For example, students need to hone their 
understanding of the differences between observation and inference and 
speculation. 

If teachers are interested in more frequent evaluations of their students' progress, 
notes and written descriptions of work underway can be turned in on a daily basis 
and quickly reviewed— not to mark with a red pencil, but to evaluate students' pro- 
gress and offer feedback as quickly as possible. 

By assembling portfolios, students will also have the satisfaction of recording their 
learning for easy reference and review. The portfolio can give these middle-level
students a sense of independent accomplishment and, if the journals are kept by 
groups of students, this learning activity can also take on a collaborative aspect. 

Assessment and cooperative learning. Using cooperative group learning
strategies in science instruction and assessment is pedagogically sound and prac-
tical. Not only do such procedures include efficient peer-evaluation techniques, but
having students work together in small groups helps to foster positive, academically
oriented peer-group norms. Cooperative learning gives students greater oppor-
tunities and incentives to articulate and communicate their understandings. By
working in groups, students learn cooperation and build interpersonal skills of in-
trinsic value. Cooperative and group work offer a practical alternative to individual
laboratory work, simplifying logistics and reducing the amount of science equip- 
ment required. Cooperative group work has special value for young adolescents as 
it turns their natural inclination for peer interaction toward a constructive end. 

Cooperative learning approaches, however, usually force teachers to set aside 
some of their assumptions about instruction and assessment, and they must accept 
and nurture a different sort of classroom climate. In these situations, the most im-
portant decision-makers are the students themselves. When students work together,
it may be difficult for teachers to quantify learning outcomes, gauge the success of 



Mr. Nguyen's class has been study-
ing simple machines. He intro-
duced this activity by having students 
identify, draw and explain simple 
machines used in their homes. As a 



construction of an article that does 
something using a combination of two 
or more simple machines. He has sup- 
plied his classroom with wood, card- 
board, glue, saws, nails, screws, wire,
pulleys, rope, etc. Students who need ad-
ditional materials put a list on the board,
and Mr. Nguyen tries to have them by the
next class period. Some students are
working outside of class as well. Mr. 
Nguyen is interested in watching his 
students to see how they move from an 
original idea to a final product. He asks 
them to keep journals that will chart the 
evolution of this process. He plans to 
videotape each student or pair of 
students presenting the process and pro- 
duct for the camera. The journals, the 
report on camera, the product, and his 
observation of his students at work will 
enable him to assess many of the skills, 
attitudes, and ways of thinking that he 
is trying to foster. 
In her science journal, Sally entered the following notes made during her trip to
the zoo:



Observations                          Thoughts and Ideas

5 monkeys, 2 are much                 Two are babies.
smaller than the rest.

A small one clings to a               Is this a baby with its
large one.                            mother? Who takes care
                                      of the babies?

Another small one keeps               Is this a baby learning to
darting to the food pan               get its own food?
and then back to the same
large monkey.

One large one sits in a               Do all the monkeys behave
corner with its face                  this way sometimes or is
turned away.                          this behavior unusual?

As Ms. Lopez checks Sally's entry, she
asks her how she might find out some
answers to her questions. Sally decides
she will check with her partner about his
observations and ideas. Then she will
not only read about colobus monkeys
but also about some different monkey
species to see whether the behaviors she
has observed characterize monkeys in
general or only colobus monkeys. She
will record what she finds out in her
science journal. She also plans to go
back to the zoo to observe additional
monkey behaviors, including those of
some very unusual spider monkeys.
She might ask José, who can draw
very well, to go with her and sketch
some of the monkey behaviors,
because his pictures sometimes tell
more than just words. When she goes
to visit her grandparents in San Diego
for Christmas, she will make a trip to
the large zoo there to observe and
record the behaviors of additional
monkey species. She thinks how in-
teresting it might be to become some-
one like Jane Goodall, whom she has
seen in a documentary on television.



students' interactions, assign marks or grades to individuals, or ensure that all
members of each group are participating and learning. Students will have to come
to understand their collective responsibility to work and learn together and to
assume some of the burden for informal, individual assessment of learning
outcomes.

The experienced teacher understands the significance of these new respon- 
sibilities. Although adolescents should be developing a strong sense of their own 
personal strengths and weaknesses through the comments of their teachers and
classmates and through their internal criteria for self-evaluation, they are likely to
need specific instruction in how to work together and assume collective respon-
sibility. They must learn to plan together, to apportion tasks and duties among
themselves, and to involve all members of the group. At its best, cooperative group
work leads the students to discover that each has different talents and interests, and 
each has special knowledge and experience that can be brought to bear. After 
carefully preparing students for group work, the teacher must assume a different 
role—that of facilitator, rather than information source. 

Of course, the teacher must still monitor individual learning outcomes as well 
as group processes in order to evaluate the progress of each student. Some specific 
techniques provide such monitoring and encourage student responsibility at the 
same time. For example, the teacher might announce that one member of the group 
will be selected to report on the group's procedures and findings, and some or all 
of every group member's grade will be based on the quality of that presentation. 
Another approach might be to give one member of the group the specific respon- 
sibility of ensuring that everyone understands the concepts involved, that everyone 
helps carry out the task, and that everyone demonstrates understanding by par- 
ticipating in the group's deliberations. 

Cooperative learning groups, composed of students with complementary skills, 
can provide a supportive environment for science learning and offer an efficient 
way to assess students' progress. By giving group members collective responsibili-
ty for ensuring that each member of the group has completed a particular assign-
ment, teachers can ease their own burden in a way that is compatible with the needs 
of middle-level students. Whether the group's assignment is to learn a new concept, 
understand how to control for variables in a complex experiment, or take account
of critical factors in the design of an application, collaborative learning and peer 
assessment can help to strengthen students' interest, provide less able students a 
chance to learn from their classmates, and reinforce understanding and skills as 
students review materials to check their own progress. 

It is during the middle-level years that differences in science achievement and 
perceptions toward science become solidified for girls and minority students. For 
example, the NAEP science data (Mullis and Jenkins, 1988) indicate that, while
average science proficiency for third-grade boys and girls was approximately the
same, except in the physical sciences, a performance gap was evident by the seventh
grade and increased by high school. Cooperative learning groups in which students
have an opportunity to display a variety of talents provide a richer context and
greater promise for girls and minority students to develop their own self-confidence
in scientific endeavors. In addition, the support of their peers may help increase 
their science knowledge and skills, and working on science in a social atmosphere 
may enhance its appeal as an interesting, worthwhile endeavor. 

Scientific inquiry as assessment. A project-oriented approach to science learn- 
ing provides ways to integrate assessment techniques with activities in ways that 
reinforce and extend learning, while also gathering useful information on students' 
progress. As middle-level students develop new abilities and interests, and as the 
curriculum becomes more challenging, teachers are offered rich opportunities to
explore different assessment techniques and to find those that are best suited to the
diverse needs of their students. Yet, over half the middle-level science teachers report
spending one or more hours per week in a typical science class administering short-
answer tests or quizzes; considering that the total amount of time the bulk of the
teachers (70 percent) report spending on science instruction each week is only three
or four hours (Mullis and Jenkins, 1988), this is particularly distressing.

It is true that some things are easier to assess than others, and that paper-and-
pencil tests and quizzes appear a quick way to test factual and conceptual knowledge.
Yet the ratio of testing to instruction reported by middle-level teachers seems inor-
dinately high for what can be learned from using paper-and-pencil methods to assess
the process and thinking objectives that are at the heart of the scientific method.
Scientific reasoning, observation, and experimentation must be grounded in a non-
trivial knowledge base of the phenomena investigated, not in the end-of-chapter
quizzes found in most science textbooks. 

Middle-level science teachers should seek out ways to assess the application of 
scientific processes in the context of the learning units they create, but following 
the textbook and simply doing activities does not necessarily offer appropriate op-
portunities to do so. Unfortunately, activities in science classes often have little to
do with real science. Textbook publishers unintentionally tend to portray science
as a static body of facts to be recalled and rules to be applied in solving artificial
problems; laboratory exercises all too often entail fixed, step-by-step procedures that
attain results known in advance. Students seldom are asked to give justifications
for their answers, explain how their experimental procedures and findings support
their inferences, demonstrate that their designs serve the intended functions, or
otherwise make their reasoning explicit. Science teaching must cross over from tell-
ing students about science and technology to having them in some small way par-
ticipate in science and develop pieces of technology. The scenario on pages 55-56
illustrates the integration of instruction and assessment through an extended unit 
that addresses several of the points we make in this chapter. 

To meet the requirements of recording grades or providing written documenta-
tion of each student's individual progress, teachers can assign meaningful, individual
tasks on which to base their evaluations of each student's performance— for exam- 
ple, designing and building models, conceiving of and conducting experiments, 
carrying out demonstrations for the class, or conducting sophisticated oral or writ- 
ten presentations on the development and results of their investigations. The 
assignments can fit into larger group or class investigations or be complete efforts
on their own. In either case, each individual assignment will yield a product or record
of student achievement that can be evaluated, and each will represent a much more
significant accomplishment than a perfect score on a limited paper-and-pencil test.

At this point, the reader might agree that assessment needs to serve classroom 
instruction, but, nevertheless, question the applicability of the suggested assess-
ment strategies for large-scale assessments. However, even short-term assessment
exercises— whether for use in a single classroom or in large-scale assessments- 
should include hands-on performance tasks that allow students to demonstrate their
Ms. Washington's treatment of the
topic of rivers emphasized energy
relationships in the generation of elec-
tricity, in the problems of the design and
construction of dams, and in such
geological processes as sediment
transportation. Her assessment plan for
the unit on rivers had multiple objectives:
(1) to determine how well the students
understood the energy relationships; (2) to
assess the students' acquisition of formal
thinking skills; (3) to assess how well the
students were developing skills in self-
assessment of their own learning; and
(4) to assess acquisition of the social and
communication skills for working in
groups.

Ms. Washington presented the assess-
ment exercise to the class in the form of
a "what if" question. But first, she review-
ed with the class the interactions of the
water in the river with objects, sub-
stances, and living things encountered
along its path of flow, focusing on energy
transfer and conservation in these inter-
actions. Then she asked the students to
think about what would happen if their
city decided to remove the power dam.
Would it be possible to return the river to
its original state? What would be the con-
sequences of removing the dam on the
local river? With Ms. Lopez's help they
could form groups to consider how their
observations of the river would have
been different if a single physical proper-
ty of the water were different, specifical-
ly, its density. What if water were as
dense as mercury? How would the river
be different? Each student thought about
the question overnight and came to class 
the next day with a list of possible dif- 
ferences and some indication of which 
differences he or she was particularly 



interested in investigating. With Ms.
Washington's help, the students categor-
ized their ideas and reduced the long list
they had generated to several questions,
which they used in forming groups to in-
vestigate individual questions.

After briefly reviewing the class pro-
cedures for working in groups on extend-
ed projects, which included the impor-
tance of planning and keeping records,
the groups began work on their in-
dividual questions. One group formed to
investigate how the dam affected the
movement of sediment down the river.
Had the dam trapped sediment behind
it? What were the consequences of the
trapping? After developing a preliminary
plan for investigating these questions,
this group began its work by (1) building
a small-scale stream, (2) seeing how
water velocity changed as the stream
entered the reservoir, and (3) learning
about how fast the water has to travel in
order to move the sediment.

Another group was interested in what
would be done with all the concrete that
had been used to build the dam. They
began their study by asking how
materials are recycled and whether old
concrete is useful. They learned that
some chemical reactions are not rever-
sible. They went on to learn about the
technology of demolition and did some
brainstorming on new uses for old
concrete.

A third group of students was par-
ticularly interested in how the old reser-
voir site could be restored to something
like its original conditions. In the process
of planning they discovered that restora-
tion ecology is an important new area of
science.

A fourth group investigated the prob-
lem of alternative energy sources. If the
dam was no longer used for electric
power, where would the growing com-
munity get energy? In their planning they
discovered that systems analysis would
help them understand the larger problem.
They did several experiments that show-
ed that people tend to waste electricity and
that energy conservation is an untapped
"source" of energy.

As the students worked on their prob- 
lems, Ms. Washington circulated about the 
classroom, offering information and en- 
couragement, rewarding desirable 
behavior, and making systematic obser-
vations. She augmented her observations
with information about her students' pro-
gress obtained from their weekly progress
reports to the class and from their journals.

When planning this assessment activi-
ty, Ms. Washington developed several
observation sheets organized by her four
objectives for the river unit. These would
guide her observations, which she could
record systematically. She focused on the
following:

• Acquisition of subject-matter
understanding. Because energy
was a subject-matter focus for this
unit, Ms. Washington listed the
forms of energy and energy principles
that the class had studied on her
observation guide. Since her objec-
tive was to assess the students'
development of understanding, she
noted instances of the application of
these concepts and principles,
sometimes recording the students'
statements (both correct and incor-
rect) about the forms of energy and
energy principles.

• Acquisition of formal opera-
tional thinking. On another
observation guide, Ms. Washington
had listed aspects of formal opera-
tional thinking to guide her observa-
tions of her students' cognitive
development. For example, the
creation of the scale model of the
stream allowed her to assess how
well her students were learning to
reason proportionally. The river
assessment exercise provided many
opportunities to observe the
students' ability to identify variables,
to test which were relevant, to 
hypothesize, and to design tests of 
their hypotheses. 
Ms. Washington looked
for instances of reflective thinking: in-
stances when the students [illegible]
problem or when they were identifying
[illegible]; when
students commented on the extent of their
[illegible]; or
when a student noted how his or her
understanding had improved or was alter-
ed by the addition of new in-
formation. The group exercise also enabl-
ed Ms. Washington to observe her
students' skills in viewing explanations
from different perspectives as they
challenged each other's ideas and
observations.
• Acquisition of skills in self-
assessment. Self-assessment skills
depend on the capacity to think
reflectively. A difference
is that self-assessment requires
students to judge their own
understanding against a standard.
In the case of the river unit, it was
the standard set by Ms. Washing-
ton's expectations. Ms. Washington
intended to be alert to how well the
students understood her expecta-
tions and their ability to assess their
performance against her expec-
tations.

• Acquisition of skills in group
process. Ms. Washington noted
several behaviors in her observa-
tion guide she would track as
groups proceeded in their work:
listening to the ideas of others,
challenging them appropriately
and considering the evidence
presented, putting forward one's
own ideas clearly and defending
them while considering counter-
vailing evidence, being willing to
put aside one's own approach in
favor of a more promising one pro-
posed by others. She also recorded
progress in oral and written com-
munication made by the students,
the latter through their journal en-
tries, laboratory notes, notes on
background reading they had
done, and records of their planning
and procedures. As groups re-
ported on their progress to the rest
of the class, she noted the quality
of their presentation and visual
aids and also the responses elicited
from the audience of their fellow
students.



proficiencies in laboratory and science-thinking skills. One step in the right direc-
tion is to pose a problem for students to solve in a limited amount of time, using
equipment that permits alternative solutions. (A larger, more difficult step is to in-
volve students in finding their own questions to pose and investigate about some
phenomenon, which makes life more complex for the teacher.) In one such prob- 
lem, based on an exercise first developed in Great Britain (Assessment of Perform- 
ance Unit, 1984-85) and that has been adapted for large-scale testing, the students 
are asked to determine which of three brands of paper towels holds or soaks up the 
most water. They have available samples of the three kinds of towels, beakers, a 
scale, a pitcher of water, and other materials. Appropriate solutions include 
saturating and then weighing the towels; saturating the towels and then wringing 
them out and measuring the water released; and soaking up as much as possible 
of a known quantity of water, then seeing how much of the water is left. 

Grading for such an activity might be based simply on whether the students were 
able to arrive at an acceptable solution, or the solutions might be rated as to ade- 
quacy and sophistication, with more or less credit given, depending on the solu-
tion employed. Another such problem, "Survival," was originally developed in Great
Britain and was adopted by the National Assessment of Educational Progress (1987)
for its pilot study on using performance tests for assessing higher order thinking.
Students are given two or more different kinds of materials and fasteners and scissors,
aluminum and plastic cans of several sizes (which they can fill with hot or cold water
to simulate persons), a fan (to simulate wind), and various measuring devices:
thermometers, a ruler, a stopwatch, and graduated cylinders. They are to determine
which of the fabrics would keep them warmer on a mountainside on a cold, dry, 
windy day. In this exercise, students need to identify the variables to be manipulated, 
controlled, and accurately measured and recorded. They need to be able to draw 
a reasonable conclusion from their data and justify it. 

In both these instances, the teacher should ask the students to explain how their
conclusions followed from their procedures and observations, and the teacher could
rate the quality of their explanations. Accuracy of the procedural and measurement
techniques might be rated separately, as might be the adequacy of the records kept.
In short, when using scientific inquiry as an assessment approach, the teacher
should emphasize both the approach and the product, how a student obtained an
answer or carried out a hands-on activity, as well as the "appropriateness" of the 
answer or performance. 

For the teacher's purposes, such performance assessments can form a natural part 
of instruction. The solutions students propose can be discussed as they occur and 
evaluated for logic and rigor. The teacher can discover much about each student's 
strengths and weaknesses that might be relevant to subsequent instruction.

Using knowledge and skills. The principle of use and application in a wider
context (sometimes called transfer) is critical to the assessment of science knowledge
and process objectives. In order to find out whether students can apply what they
have learned in new contexts, teachers may need to deliberately avoid asking some
questions or giving some examples in their classroom instruction and reserve these
materials for assessment questions, exercises, or special projects that will yield
evaluative information. Just as an English teacher might ask students to analyze
a poem they have not seen before, the science teacher might ask students to discuss
the special environmental adaptations of an organism they have not studied, or ask
them to predict the chemical properties of a compound with which they have not
worked.

It is often possible to imagine a continuum of problems, increasingly dissimilar
from those actually studied. If students have studied the motion of frictionless pucks,
a question about billiard balls and bumpers would introduce few new elements,
but a problem about the collision of objects in three-dimensional space might in-
volve significant transfer. Still more remote would be problems involving the
simultaneous interaction of more than two bodies in space. In contrast to giving
the students problems, the teacher could ask them to think up their own questions
about applications of friction. Both strategies would give teachers a good idea about
where the students might fail at different points along the continuum of being able 
to apply their science knowledge and skills to unfamiliar problems and situations. 

The nature of this continuum in any particular case would depend on the kinds 
of activities the students had engaged in and the instructional goals of those ac-
tivities. The basic principle, in any case, would be that the difficulty would be deter-
mined by the number of common versus dissimilar elements in the instructional
versus assessment situation, where element is loosely defined. For example, in an
assessment exercise about a situation including only a few new elements, the
students might be asked to adapt solution procedures previously learned and arrive
at precise answers. For more novel situations, they might be asked to estimate or
roughly describe what might happen. In the example given above about friction
in three-dimensional space, the students probably could do no more than to in-
dicate the nature and direction of changes that friction would introduce and,
perhaps, to justify their answers. For the most unfamiliar problems and contexts,
the expectation might be that students could indicate what basic principles would 
still be expected to hold, for example, the conservation of energy and momentum 
or the balancing of forces.

Broadening the definition of what counts as assessment. Teachers can col-
lect "hard evidence" of performance in many ways, without administering a single
test or quiz. To collect evidence of learning in these more constructive and mean-
ingful ways, middle-level teachers must learn how to rely on their judgment and
powers of observation. Teachers should question why, for one or another student,
their subjective assessment of achievement might not agree with the test-score
averages maintained for that student in their grade books. This may be because the
tests teachers give, on which they base students' grades, are not refined or comprehen-
sive enough to reflect the numerous and complex facets of student learning.

Much, probably most, of the information teachers use to guide their instruc-
tional decision-making comes, indeed should come, not from formal tests, but from
informal classroom observations. Although multiple-choice and short-answer tests
and quizzes can go beyond measuring facts and convergent thinking, such in-
struments are very difficult to develop, even by professional measurement experts.
End-of-chapter questions and multiple-choice and short-response tests can provide
a measure of the students' surface understanding of processes and their ability to
define terms. However, over-reliance on these tools can lead teachers to overlook
other highly valued aspects of learning, such as the inclination to question, the abili-
ty to transfer learning to new situations, and the capability to analyze complex
interrelationships.

Good instruction requires that teachers be sensitive to their students' most sub-
tle signs of progress. Monitoring facial expressions for flickers of understanding or
puzzled looks is perhaps an obvious example of one way for a teacher to "tell how
I am doing." Viewed differently, it is also a way for teachers to tell how their students
are doing. 

There are ways of doing such observations better and, at the same time, increas- 
ing the credibility of these observations as a source of information for marking and 
grading, and communicating with parents. First, such observations should be 
somewhat systematic— done on a regular basis. Teachers might carry around a 
packet of index cards to jot down observations on what particular students do from 
time to time. They might spend a few minutes at the end of each day (at the very 
least, every few days) to file these observations for future retrieval. Teachers should 
be alerted to the human tendency to note the atypical and neglect the commonplace
(Almy and Genishi, 1979). Routine observations are of value. Also, it is important
that informal observations systematically cover all the students in the classroom.
In short, teachers should be scientific observers. Informal observations on the face
of it seem more valid, less artificial and contrived, than more formal, written
measures. However, unfortunately, they may also, on the face of it, appear less
reliable and lacking in the comfort provided by a row of "objective" scores in the
grading book. Reliability comes through standardization and replication. Multiple- 
choice tests are reliable, in part, because they are standardized across learners, but 
also in part because they involve the summation of many independent responses 
to a variety of questions. The same principle can be used to enhance the reliability 
and the status of informal observations, so that they will provide the same security 
under fire as the row of test scores.
Reliability can be increased by aggregating the results of observations over multi-
ple occasions. Validity will be highest when multiple observations also involve a
degree of "convergence" or "triangulation," a synthesis of evidence from different
contexts, applying different modes of representation. Although the informal obser-
vations themselves do not yield tangible pieces of evidence that can be put on the
bulletin board or sent home to parents, documentation of these evaluations for
students together with the behaviors that support the evaluations can be very
valuable and powerful. Parents and administrators, as well as teachers themselves,
should remember that these assessments are individualized and made by the per-
son in the best position to make the judgments. Teachers should not feel uncom-
fortable with the idea that their perceptions count as assessment. Such written
records can provide a way to chart the progress of individual students and the class
as a whole.
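
As a rough illustration of the aggregation principle, the sketch below (in Python,
which is not part of the original report) averages a teacher's ratings of one student
over several occasions and applies the Spearman-Brown formula, a standard
psychometric result that the report itself does not cite, to estimate how reliability
might grow as comparable observations accumulate. The 1-to-5 rating scale, the
sample ratings, and the single-observation reliability of 0.30 are hypothetical
values chosen only for illustration.

```python
from statistics import mean


def aggregate_ratings(ratings):
    """Average one student's observation ratings across occasions."""
    if not ratings:
        raise ValueError("at least one observation is required")
    return mean(ratings)


def spearman_brown(single_reliability, n_observations):
    """Estimate the reliability of the mean of n comparable observations.

    Standard Spearman-Brown formula: r_n = n * r / (1 + (n - 1) * r).
    """
    r, n = single_reliability, n_observations
    return (n * r) / (1 + (n - 1) * r)


# Hypothetical index-card ratings (1 = low, 5 = high) for one student.
ratings = [3, 4, 3, 5, 4]
print(aggregate_ratings(ratings))  # 3.8

# Hypothetical reliability of a single informal observation (an assumption).
for n in (1, 5, 10):
    print(n, round(spearman_brown(0.30, n), 2))
```

Under these assumed numbers the estimated reliability rises steadily with the
number of observations, which is the sense in which a file of dated index cards
can come to carry weight comparable to a row of test scores.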

As they refine their monitoring skills, teachers can listen to and reflect on the ques- 
tions that students ask, not just to provide an answer, but to evaluate what such 
a question could mean from the perspective of students' misconceptions or 
misunderstandings. Interpreting conversations among students in cooperative learn- 
ing situations and using this information to make determinations about the quali- 
ty of their understanding can also be an illuminating strategy and provide the basis 
for reiterating material with greater specificity or in a new context. Simply observing 
students as they work on projects can be another very useful way to assess their 
learning. If students are having difficulty applying their science knowledge and
understandings to hands-on situations, this may indicate that they have an inade- 
quate or flawed understanding of the underlying concepts. 

A caution is in order here. Rote hands-on activities are no better than rote 
memorization. It is entirely possible for students to perform the steps of an experi- 
ment or to follow instructions that lead them through an activity without having 
even a glimmer about the purpose of the experiment or project. Teachers, therefore, 
must be aware of the concepts or deep learning they are trying to foster and con- 
tinuously check each student's understanding against these goals. Simply asking 
students if they know the scientific principles involved in an activity can be reveal- 
ing. Beyond that, asking them to apply those principles to a new situation might 
yield interesting and informative replies. 

Teachers have many ways to assess students by using their considerable powers 
of observation and confirming these observations through questioning and dialogue. 
Documentation of these assessments will provide useful and valid evidence of 
students' learning, or the lack thereof. Teachers should recognize, however, that such
assessment strategies can raise more questions than they answer. For one thing,
for some students, problems with language and inability to express themselves clear-
ly may be confounded with problems in science understanding. For another thing,
in noting a particular student's misconceptions or lack of understanding, a teacher
might wonder if that student has other misconceptions in related areas and also
need to begin to explore these.

However, as middle-level teachers begin to merge instruction and assessment
(watching how students approach and perform tasks, monitoring the questions they
ask, and evaluating the final products of sustained learning), they will have ample
information to assign a defensible grade. More importantly, they will have developed
assessment procedures that facilitate their instruction and respond in a more com-
prehensive way to the needs and abilities of their students.



Taking assessment out of the classroom. Out-of-school projects and
homework can be a very productive and efficient way to monitor the students' pro-
gress. Asking the students to answer a question in depth or to conduct activities

Mr. Nguyen has introduced his class
to simple machines, giving them
experiences and challenges with inclin-
ed planes, levers, pulleys, wheels and ax-
les. Eventually he wants his students to
construct some machines of their own. 
Before he goes on to this part of the unit,
however, he wants to emphasize the
usefulness of machines in everyday life. 
He has found that drawing can be a 
powerful tool for assessing understan-
ding. He assigns a piece of homework
asking the students to find one or two 
household objects that are simple 
machines and draw the object, and to
name the simple machines. When the 
students hand in their homework the 
next day, Mr. Nguyen is excited. Looking
at the drawings, he is pleased that the 
students are able to identify and name 
simple machines. He is struck by the care 
and accuracy of the drawings, which
suggest to him that the students were in- 
vested in this assignment, that the 
school-real life connections have been
made. He puts the homework drawings 
on the walls of his classroom and notes 
the pride with which the class looks them
over. The students are reporting finding
simple machines everywhere: flag-
poles, road machinery, kitchen gadgets,
roofs, and ramps. He speculates that the
students' feeling of competence and the
connection between school work and 
the real world have enhanced a disposi-
tion to acquire additional knowledge.

at home and write up their results for review conserves classroom time and
resources, while at the same time giving teachers the opportunity to gauge the
students' learning. (An extended example of this strategy is given in the next
chapter.) Many avenues of assessment open to teachers beyond the commonly
adopted worksheets and end-of-unit tests, including investigations, projects, jour-
nals, in-depth written reports, and even drawing (as illustrated on pages 61-62), can
be undertaken outside of the classroom (in the home, school, or community),
and this adds to their potential appeal and relevance.

Methods of assessment that are based on projects and explorations and that are
an extension of students' own interests and learning experiences may be viewed
as challenging, relevant, and even fun, rather than tedious. For example, investiga-
tions that involve controlled experiments can provide opportunities for the students
to apply their understandings of science in settings relevant to their daily lives.
Watching a growing plant or animal can provide the opportunity to keep a daily 
record or journal, which the student can submit after several months for the teacher's 
evaluation. 

Projects for which the students must collect information from their friends, 
neighbors, and relatives or solve a school or community problem might be of
relevance to their own concerns and interests. For example, investigating others'
points of view about pollution or waste disposal can help the students understand
the ways in which such issues intertwine with science and technology and touch
people's daily lives. Students should learn to appreciate that potential solutions in-
volve taking account of technological advances, scientific principles, and underly-
ing political complexities. If students are asked to interpret their findings at regular
or significant intervals, then their changing interpretations provide a long-term 
perspective on their learning. 

Finally, taking assessment out of the classroom instills values that are consistent 
with the goals of science learning and with educational goals for the middle school 
as a whole. It prepares students for the larger world outside the classroom. The more
closely instruction and assessment mirror the processes of scientific exploration,
inquiry, interpretation, and documentation, the more likely it is that the students 
will come to understand how scientists think, how science develops and 
technologies are created, and what their contributions are to society and to 
individuals.



Striking a Balance



The primary message in this document and its predecessor report on assessment
in elementary school science (Raizen et al., 1989) is that assessment should be in-
tegrated with instruction. This works to the advantage of the students. When assess-
ment is integrated with instruction, teachers can use the results in a diagnostic and
formative way, to alter their instructional sequences and emphases. They know
when the students are confused and they can structure appropriate learning ex-
periences. However, in science instruction as in the assessment of student learn-
ing, a balance must be maintained among the various strategies used. 

The approaches described above are intended to broaden horizons and present 
assessment strategies practiced in good science classrooms. However, we urge a
variety of approaches. For example, while hands-on and laboratory work is integral 
to learning science, there will continue to be a place for reading and written work. 
Cooperative learning strategies have many strengths, but groups and individuals 
exhibit competencies in different ways, and individuals should also be assessed ac- 
cording to their own particular accomplishments. Finally, because of the reality of 
giving grades, teachers' informal assessments should be accompanied by standard-
ized procedures. In addition to the value of monitoring student activities and the
information gained from evaluating products of long-term projects, there is still a 
place for strong, well-designed paper-and-pencil examinations that include essay
questions assessing cumulative learning and attainment of instructional goals.



Chapter VI

Innovative Assessments: New Directions 



In the last chapter, we discussed assessment strategies designed to support good
instruction. These strategies are available to any teacher willing to move beyond
multiple-choice tests and short-answer quizzes. Some educators argue, however,
that assessments should go beyond these instructional functions, that they should
be learning experiences for students in their own right and that tests should model
good instruction (Sizer, 1986; Wiggins, 1989; Archbald and Newmann, 1988; Feuers-
tein et al., 1967; de Lange, 1987). In this chapter, we describe several innovations
in assessment that are working toward these new expanded goals for assessment. 
The first two are examples from other countries, the next few are examples from 
state assessment programs in the United States, and then we describe an example 
developed at a university for the purpose of doing research in classrooms. The last 
section of this chapter discusses the potential of the computer in assessing science 
learning, based on its role in science instruction. 



Innovative Curriculum and 
Assessment from the Netherlands 

The first example of innovative assessment is taken from secondary mathematics 
in the Netherlands. The reader might wonder what relevance this has for middle-
level science programs in the United States. In our view, although many of the sur-
face elements are different, the problems and issues are very much the same. The
assessment we shall describe was developed for the Mathematics A program in
the Netherlands, a program designed for high school students who would not be
mathematics majors but would use mathematics as a tool in their careers (e.g.,
economics, medicine) and in their lives. Therefore, the primary focus of this cur-
riculum would be its usefulness. In middle-level science programs in the United
States, teachers often have very similar goals. They recognize that only a small
proportion of their students will become scientists but that a much larger number
should have the competence to enter science-related careers and use science in
other aspects of their lives. Therefore, they want science to be useful. They are 
also aware of the disappointing fact that, for many students, their contact with 
chemistry and physics in the middle school will be their last formal learning ex-
perience in these subjects.

As they explored the appropriateness of different strategies for assessing the ef-
fectiveness of the Math A curriculum, de Lange and his colleagues (1987) were in-
fluenced strongly by Gronlund's (1968) assertion that achievement testing should
be a learning aid. After looking at more than one hundred tests used in twelve par-
ticipating schools, the researchers concluded that, for complex problems like
those in the Math A curriculum, the students needed sufficient time to read,
reflect, mathematize, and interpret results. Yet, strict time limits were set on com-
monly used written tests, which placed severe constraints on what could be asked
of students. Therefore, these tests incorporated only a limited scope of the goals of
the Math A program, making feedback to the educational process minimal and
defeating a major purpose of the tests: to help improve instruction.

A recent report, Everybody Counts (Mathematical Sciences Education Board, 
1989:68) warns: "tests stress lower rather than higher order thinking, emphasiz-
ing student responses to test items rather than original expression and thinking."
This captures de Lange's (1987:177) concerns: "Examinations in mathematics
which consist only of timed, written papers cannot, by their nature, assess ability 
to undertake practical and investigational work or the ability to carry out work of 
an extended nature. They cannot assess skills of mental computation or ability to 
discuss mathematics, nor, other than in very limited ways, qualities of persever- 
ance and inventiveness." He suggests that these qualities can only be assessed in 
the classroom over an extended period. He warns that tests lead teachers to em-
phasize in their classrooms activities that are directly related to the type of ques-
tions used in the examination "...which means that practical and investigational
work finds no place in day-by-day work in mathematics" in cases where timed
tests represent the only method of assessment. As Everybody Counts states (p. 69):
"What is tested is what gets taught. Tests must measure what is most important."

Assessment Principles Used in the Netherlands 

De Lange developed the following five principles for effective assessments, which
serve as his criteria for judging different assessment approaches:

1. Tests should improve learning. A properly designed test or task should
not only motivate students by providing them with short-term goals toward
which to work, but also by providing them with feedback concerning their
learning progress.

2. Tests should allow students to demonstrate what they know (posi-
tive testing) rather than what they don't know. Otherwise, students
may lose confidence, which should be avoided at all times.

3. Tests should operationalize the goals of the Mathematics A curri-
culum. More specifically, tests must be developed that provide the freedom
of response required for measuring certain complex outcomes. These include
the ability "to create, to organize, to integrate, to express, and similar behav-
iors that call for the production and synthesis of ideas."

4. Test quality is not in the first place measured by the accessibility to
objective scoring. De Lange accepts the fact that competent, independent
judges may score differently, but within certain limits.

5. Tests should fit into the usual school practice.

Alternative Assessments Used in the Netherlands 

The assessments designed by de Lange involve four different strategies that can be 
combined as appropriate: tasks that are to be accomplished in stages, tasks to be
accomplished outside school hours, essays, and oral performance. 

The two-stage task. Inspired by the ideas of Van der Blij (as referenced in de Lange,
1987), the two-stage task uses both short-answer and essay questions and provides
students with two separate grades, one for each stage. The first stage, essentially
a preparation for the second stage, consists of a traditional, time-restricted, written
test administered to all of the students simultaneously for completion in class. The
students are expected to answer as many questions as possible with an orientation
toward having students find out what they don't know rather than demonstrating
to the teacher what they do know. Most of the attention is given to the "lower goals"
of computation and comprehension. Scores are as objective as they can be under 
these conditions. The teacher scores the first-stage papers, and hands the tests back 
to the students with the biggest mistakes (and only those) and the scores disclosed. 

In the second stage, the student repeats the work at home using the teacher's feed- 
back. Interactions (e.g., outside advice and library research) are permitted between 
the two stages. At a designated time, perhaps three weeks later, the students turn 
in their work, and the teacher scores the tests a second time.

The second stage follows the five principles listed above. The test improves learn-
ing; it emphasizes what students do know; it gives attention to the higher goals of
interpretation and reflection; it uses subjective but reliable scoring; and, by allow- 
ing the students to work at home, it fits with usual school practice. 

Three findings are worth noting: 

1. There is a relatively wide spread in scores for the first stage, from very poor 
to excellent. At the second stage, the spread of scores is greatly reduced with 
more students doing well. 

2. Girls perform relatively more poorly than boys in the first stage. At the second 
stage, this difference disappears. In fact, the best results were scored by girls. 

3. Students experienced enhanced self-confidence when they were able to im- 
prove in the second stage. 

The take-home task. Following a fifty-minute written task, a sample of students 
were allowed to choose one out of five subjects to work on at home. They could either 
work alone or in pairs. 

The essay task. The students were given a newspaper article on the problem of
overpopulation in the Republic of Indonesia. The article contained a great deal of
numerical information, but made no use of graphic representation. The students
were asked to rewrite the article, making optimal use of graphic representation. This
task called on the students to find relevant mathematics in the text, find relation-
ships between different facts, and reflect upon different aspects of both in rewriting
the article and developing the appropriate graphs, tables, and charts. 

The oral task. Once a standard part of examinations in the Netherlands, this type
of assessment has become less popular recently. De Lange reinstituted it in order
to study its effectiveness as an assessment procedure for Math A. All interviews took
twenty minutes and involved the student, the teacher, an external independent examiner,
and an observer. The first question differed from student to student, using
questions that seemed appropriate to the expected performance level of the student.
De Lange noted that one advantage of the oral exam over all forms of written tests
is that one is able to find out how much relevant information a student really needs
to start solving an assigned problem. He cites as another advantage the observation
that, because of hints provided by the interviewer, the students "never got stuck."
On the negative side, some students felt rushed due to time constraints, felt nervous
because of the presence of officials, and felt uneasy at not being able to do actual
computations.



Some Conclusions 

For three reasons, de Lange recommends using a combination of untimed assessment
strategies to assess the Math A curriculum. First, de Lange discovered that
the correlations among the restricted-time written test, the take-home task, and
the oral test were low, indicating that these tests actually measure different dimensions.
(This is consonant with findings by Applebee et al. [1989] on assessing different
dimensions of writing competence.) Second, the different testing strategies
yielded different patterns of results for boys and girls. Specifically, boys performed
considerably better than girls on the time-restricted written tests; on the stage-two
and take-home tasks, boys and girls performed at more or less the same (high) level;
and on the oral tests, boys and girls performed more or less the same and at a level
between the time-restricted written tests and the take-home tasks. Third, the untimed
strategies more closely paralleled the goals of the Math A curriculum, which
is strongly process oriented, focuses on higher thinking skills, and attempts to enable
students to engage in the mathematization process, all of which require time for
students to reflect and to generate creative and constructive thought. Perhaps the
most important message of de Lange's work, however, is that assessments can be
developed that are truly criterion referenced, where the goal is to assess a
student's achievement in learning rather than to spread out individual test scores
along a predetermined distribution curve.
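
The first finding can be made concrete with a short computation. The sketch below calculates pairwise Pearson correlations among three assessment formats. It is an illustration only: the scores are invented and are not de Lange's data, and the names pearson and pairwise_correlations are hypothetical. Low coefficients between formats would be consistent with the claim that the formats measure different dimensions.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def pairwise_correlations(scores_by_format):
    """Correlate every pair of assessment formats (one score per student)."""
    return {
        (a, b): pearson(scores_by_format[a], scores_by_format[b])
        for a, b in combinations(scores_by_format, 2)
    }


# Invented scores for six students, listed in the same student order.
scores = {
    "timed_written": [35, 80, 55, 90, 40, 65],
    "take_home": [70, 75, 85, 72, 88, 78],
    "oral": [60, 62, 75, 58, 66, 70],
}

for pair, r in pairwise_correlations(scores).items():
    print(pair, round(r, 2))
```

With real class data, each list would hold one score per student, in the same student order for every format.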



The Use of Profiling and
Moderating Panels in Great Britain

Great Britain currently is developing a new national curriculum in many areas, including
science. According to the report of the Task Group on Assessment and
Testing (Department of Education and Science and the Welsh Office, 1987), the emphasis
of the new assessments will be on developing profiles of each student on
a set of between four and six profile components. For each component, there are
twelve attainment targets that have been identified, which are identical for all grade
levels tested (ages seven, eleven, fourteen, and sixteen) but take into account the
expected growth in knowledge and skills. The assessments proposed by the Task
Group are like teachers' day-to-day assessments: they are directly concerned with
what is being taught, and they are designed to reveal the quality of each pupil's performance
irrespective of the performance of other pupils.

For example for 7 year olds, and largely for 11 year olds, it is proposed that the
tests will take the form of topics for children to work on. These will be designed
so that they look like interesting pieces of work ordinarily met in class. In the
course of doing them, children will be able to display a range of achievements
which teachers can assess, by observing children's activity and by marking
work -- artistic, written, oral -- that they produce using standard procedures.
Teachers will be able to select tasks from a "bank," choosing subjects
and contexts suitable for the background and interests of the pupils involved,
because children are more likely to do full justice to themselves in contexts
which are familiar and interesting to them (pp. 11-12).

The results of these assessments can be used by teachers for their instructional
and evaluative purposes and will also be aggregated at the school level. To ensure
comparability, teachers will use "moderation" meetings with teachers from other
schools to discuss the progress of their groups of children, including consideration
of the spread of results from the national tasks compared to the spread of results
from their own assessments. The final responsibility for decisions about the progress
of individual pupils will rest with their teacher.

Performance Assessment in California 

Since 1988, California has been field testing for eventual incorporation into its
state assessment program a series of performance tasks and open-ended questions
emphasizing the key concepts and principles of science for grades six and
twelve. Performance tasks focus on science process skills embedded in the content
areas of the life, physical, and earth sciences. The open-ended questions
focus on engaging students actively in the use of hypotheses and the design of
scientific investigations and processes, as well as providing opportunities to respond
to societal and ethical issues related to science.

In the spring of 1990, California will pilot a new testing program, the California
Golden State Examinations. These are optional tests that students may elect to
take in order to receive a special endorsement of their diplomas. In chemistry and
biology, these examinations will include a combination of different modes. The
multiple-choice questions will be conceptually based (thirty minutes); the open-ended
questions will ask students to respond to a prompt by interpreting or entering
data on a chart or graph, drawing a picture to answer a question, or writing a
short analytical paragraph (ten minutes); and performance tasks will use a laboratory
setting (fifty minutes).

California's Assessment Program (CAP), intended to probe student achievement 
statewide, has identified nine overlapping characteristics of a desirable assessment 
program: 

• Emphasis on production rather than discrimination.

• Modeling good instruction. Tests should be mirrors of instruction, and
assessments should provide a learning opportunity in and of themselves.

• Focus on integration. The tasks should be multidimensional in skills assessed,
multisensory in stimuli presented, and multimodal in response formats.

• Fewer tasks of greater depth and breadth. The right kind of exercises
would take considerably longer than normal multiple-choice questions; hence
that portion of the test (assuming also a short-answer portion) would consist
of a relatively small number of tasks.

• Interdisciplinary learning and assessment. Complex multidimensional
tasks would cut across disciplinary lines, providing opportunities for students
to write about science and tell how they would solve a social problem. Instruction
and assessment would focus on large real-life problems, such as deforestation
or hunger.

• Communication. Exercises would demand demonstration of how clearly the
students could communicate learning.

• Face validity. The tasks should be credible to the teachers, parents, and
students.

• Learning and assessment in groups. The ability to interact, negotiate, and
cope with different opinions to achieve common ends should be part of the
assessment.

• Renewed emphasis on speaking and listening. Oral examinations could
take several forms, for example, student debates, peer problem-solving sessions,
examinations of small groups that have done research together, or examinations
of individual students.

California is also experimenting with collecting portfolios of the students' work
in mathematics. In the spring of 1989, fifty-five teachers teaching grades three and
six were asked to collect their students' work over two or three months. Each portfolio
included three or four pieces of individual work, one report on a group project,
and a reflective or imaginative piece that asked the students to reflect, in writing,
about the work done in mathematics class. Subsequently, the teachers met to review
each other's portfolios, compare criteria for assessment of their students' work, and
exchange both teaching and assessment ideas. The experiment will be repeated during
1989-90, with more teachers participating.

Also in mathematics, the California State Department of Education (1989) has been
including open-ended questions in its most recent twelfth grade mathematics tests.
The purpose is to give the students an opportunity to think for themselves, construct
their own responses, and demonstrate the depth of their understanding, and to
encourage the students to solve problems in several ways. For 1987-88, the first year
that open-ended questions were included in the test, a random sample of 2,500
responses (500 to each of 5 questions) out of a total of 240,000 responses was
reviewed by a special committee. Performance on a large percentage of the responses
(well over half for all but one of the problems) was rated as inadequate. The committee
members surmised that the results indicated that students were not used to
writing about mathematics and had little experience in reflecting on or describing
their thought processes as they solved mathematical problems.

Performance Assessment 
in Connecticut 

In summer 1989, the Connecticut State Department of Education received a grant
from the National Science Foundation to work collaboratively with the Coalition
of Essential Schools and the state departments of education in Michigan, Minnesota,
New York, Texas, Vermont, and Wisconsin to develop performance assessments
in science and mathematics. In August 1989, three dozen high school
teachers from these states met and formed the Connecticut Multi-State Performance
Assessment Collaborative Team (CoMPACT) to develop performance tasks
that would be tried out in their classes during the 1989-90 school year. Criteria for
developing effective tasks included the following:

• The tasks should be based on essential rather than tangential aspects
of the curriculum. They should represent "big" ideas or significant
themes. 

• The tasks should be authentic rather than contrived. They should 
use the processes that scientists or mathematicians use, and the outcomes of 
the tasks should be of value to students. 

• The tasks should be rich rather than superficial. They should cause 
the students to raise related questions, consider other problems, and make 
new connections. 

• The tasks should be engaging. They should be thought-provoking and 
foster persistence. 

• The tasks should require the students to be active rather than 
passive. The students should construct meaning and deepen their 
understanding as they solve complex problems. On a subset of tasks, the 
students would be encouraged to work collaboratively with other students.

• The tasks should be integrative rather than fragmented. The
students should be expected to bring together many separate pieces of
knowledge in the completion of a given task.

Connecticut will use tasks designed to meet these criteria to assess the progress
of students on the goals set forth in Connecticut's Common Core of Learning (Baron
et al., 1989), an exposition of the state's educational expectations for its students.
The tasks will assess students' scientific attitudes and dispositions as well as skills,
processes, and knowledge. Toward these multiple ends, the first set of tasks to be
developed will focus on sustained group tasks. These tasks could take anywhere
from part of a class period to several weeks to complete. It is intended that the
students will work together to plan and conduct investigations and solve real-world,
multistep problems.

The participants in the summer workshop recognized and readily acknowledged
that effective performance tasks closely resemble effective instructional activities.
In fact, one of the goals of the assessment is to model good instructional tasks.
However, the component which makes a task appropriate for use in assessment
is the existence of accompanying scoring guides. In science, students will be scored
on their understandings and applications of scientific knowledge and concepts, as
well as on their scientific attitudes and dispositions, their effective employment of
the skills and processes of science, their ability to use scientific tools and apparatus
safely and appropriately, their ability to work effectively as a member of a group,
and their ability to communicate their findings effectively.

The CoMPACT also developed draft criteria for determining whether a performance
task is appropriate for group work. Two classes of problems seem particularly
well suited. First are those that are too large or too time-consuming for an individual
working alone to complete. Related to the jigsaw approach (Aronson et al., 1978),
each student in the group would collect different data or do a different piece of
research, thereby fostering positive interdependence among the members. Each individual
also would have to integrate all of the various pieces so as to be able to contribute
to the completion of the group task. This approach has been successful in
enhancing the self-concepts of the students, who come to see themselves as indispensable
to the work of the group, and it often raises their level of performance accordingly
(Aronson et al., 1978). A second category of appropriate problems comprises those
to which each individual brings only a partial understanding of the scientific
phenomena under consideration. Working together and sharing ideas has the potential
to deepen the students' collective and individual understanding (Cobb et al.,
in press).

Teachers who believe in the value of group work often become frustrated when
it comes to assessing the contributions of the individuals within the group. Because
of their need to assign grades to individual students and be able to justify these
grades, teachers need to have valid and reliable techniques available to them for
assessing the achievement of each student. The CoMPACT is currently exploring
different scoring approaches to group work that will allow teachers to check for the
understanding of individual group members. This will permit the teachers to assess
the performance of the individuals while at the same time rewarding the efforts of
the group.



A Three-Stage Performance Task

One of the many strategies being considered by the CoMPACT is the development
of a three-stage performance task, which was stimulated by de Lange's two-stage
testing. In this model, teachers would obtain an initial assessment of each student's
knowledge and understanding at the beginning of the group task. For example, this
initial assessment might call for a written prediction about what might happen during
the course of an investigation, with the students asked to provide reasons or
related descriptions and explanations of their predictions. Or, the students might
be asked for a preliminary design with an accompanying rationale. A second series
of measurements of each student's understanding would be taken during the interval
in which the group is working. These would be on-line checks for understanding,
accomplished through such informal means as students' journals and logs, oral
interviews, and a paragraph turned in at the end of the class. The third stage would
occur at the completion of the group work. Each student would be asked to complete
independently a near-transfer or extension task, for example, something closely
related to the knowledge and processes used in the group task with an appropriate
degree of novelty. If students used the group experience to enhance their understanding
of the scientific concepts and principles inherent in the task, they would be
able to succeed on related but unfamiliar problems or tasks set in a novel context.
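
The three stages can be pictured as a simple record kept for each student. The sketch below is one possible way to organize such evidence and is not a CoMPACT scoring scheme: the class name ThreeStageRecord, the 0-4 rubric scale, and the idea of reporting "growth" alongside a shared group score are all assumptions made for illustration, and the two students are invented.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import List, Optional


@dataclass
class ThreeStageRecord:
    """One student's evidence across the three stages (hypothetical 0-4 rubric)."""
    student: str
    initial: float                                      # stage 1: prediction or design
    checks: List[float] = field(default_factory=list)   # stage 2: journals, interviews
    transfer: Optional[float] = None                    # stage 3: near-transfer task

    def growth(self) -> Optional[float]:
        """Change from the initial assessment to the individual transfer task."""
        if self.transfer is None:
            return None
        return self.transfer - self.initial

    def mean_check(self) -> Optional[float]:
        """Average of the informal checks taken while the group was working."""
        return mean(self.checks) if self.checks else None


def individual_and_group(records: List[ThreeStageRecord], group_score: float):
    """Pair each student's individual growth with the shared group score."""
    return {r.student: {"group": group_score, "growth": r.growth()} for r in records}


# Invented example: two members of one group.
ana = ThreeStageRecord("Ana", initial=1, checks=[2, 3], transfer=3)
ben = ThreeStageRecord("Ben", initial=2, checks=[2], transfer=2)
report = individual_and_group([ana, ben], group_score=3.5)
```

Keeping the group score and the individual measures side by side reflects the aim stated above: to assess each individual while still rewarding the efforts of the group.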



Portfolios in Vermont

Vermont, like California, is pilot testing the use of students' portfolios in its
statewide assessment of writing and mathematics in grades four and eleven and
is considering the inclusion of portfolios in other areas, including science. According
to an August 1989 draft developed for teachers by the Vermont State Department
of Education:

Portfolios will be used to provide data in areas not reasonably addressed
through standardized tests. The content of student portfolios in mathematics
should reflect evidence of the ability of students to solve both routine and
non-routine problems, in both group and individual situations. There should
be evidence of a student's ability to communicate and reason mathematically.
Portfolios should show student growth in understanding and using connections
among various mathematical topics and between mathematics and
other disciplines. There should be examples of the student's work in exploring
problems and describing results using a variety of models or representations.
Reflections on the student's own thought processes in solving problems and
on the feelings and attitudes of the student, as well as a self-assessment of
strengths and areas needing improvement, should also be included.

The portfolio should contain a few examples of a student's best work collected
over a period of more than one year. It might include the following:

• A solution to a problem assigned as homework or given on a test or quiz. The 
solution should show originality or deviation from the usual procedure, not 
just a neat set of figures. Several different solutions to the same problem could
constitute one entry. 

• A problem made up by the student, with or without a solution, depending on 
the complexity. 

• A paper done for another subject that contained some mathematics, such as 
an analysis of data presented in a graph, particularly if the data were collected 
by the student. 

• A report of some group activity or project, with comments as to the individual's
contribution (e.g., surveys, reviews of the use of mathematics in the media). 

• A picture made by the student of his or her work with manipulatives, or two-
or three-dimensional figures as a solution to a problem, or a description of a
mathematical concept or situation. 

• Art work done by the student involving mathematics, such as drawn designs, 
coordinate pictures, scale drawings or maps, etc.

• A videotape of a student or a group of students giving a presentation involving
mathematics.

• A report on the history or application of some mathematical concept. 

• An entry or entries from the student's journal. 

During the 1989-90 school year, teachers are being recruited to participate in a
pilot study of the use of portfolios in order to provide their reactions, and their
students' reactions, to generating portfolios. The teachers will note the advantages
they see, problems they encounter, uses they make or envision for the portfolios,
and suggestions they have for further consideration. They also will keep track of
the things they do and the time they spend with the portfolios, recording how often
they review the portfolios with the students and the extent to which they use portfolio
materials when meeting with parents.

Using Naturally Occurring Problems 
to Assess Students' Understanding

Linn and Songer (1988) have described a research program consisting of a
thirteen-week unit on thermodynamics, which used a series of microcomputer-based
laboratory experiments with real-time data collection to collect, record, and
instantaneously display laboratory data. The research team used successive curriculum
reformulation to deepen eighth-grade students' understanding of the difference
between heat energy and temperature. In order to examine whether the
students really understood the role of the different variables involved (e.g., starting
temperature, volume, surface area, and insulation), the team developed a set of
transfer tasks. Even though the students understood the effects of the variables in
laboratory settings, they had considerable difficulty applying their understandings
to a series of graded, natural-world problems. The specification of appropriate
transfer tasks represents an innovative approach to assessment because it takes into
consideration single- and multi-variable situations and the degree of similarity between
the naturally occurring problem and the laboratory tasks in the curriculum.
Using problems in assessment that occur naturally in the world outside school may
have the great benefit of developing models for instructional activities designed to
"teach for transfer," a term that has found many advocates but little application
because of the lack of cogent exemplars.



The Potential of 
Computers for Assessment 

In science classrooms, computers have been used in four major ways (see Guertin
et al., 1987; Abeles, 1989; Linn, 1988), and each use has implications for assessment.
First, department heads and teachers have used computers to assist them
in classroom management activities. These include keeping inventories of
materials, budgeting, computing students' grades, and preparing tests. Second,
they have been used in various ways to assist instruction, including drill and practice,
tutorials, simulations, and research (e.g., databases, spreadsheet analyses of
data, and word processing for report writing). A third category involves telecommunications,
with students from different physical locations inputting and sharing
data on a phenomenon or common problem. Fourth, computers are used in
microcomputer-based science laboratories as a way to collect, portray, and analyze
real-time data.

Recent advances in computer hardware have made possible quite sophisticated
instructional and assessment activities. The use of the microchip has reduced
costs and processing time while increasing computer memories. Optical laser disk
technologies combine the power of the computer with the remarkable storage
capabilities of laser disks. One of these disks, the CD-ROM, is only about five inches
in diameter and stores 270,000 printed pages of information. This means
that one or two CD-ROM disks can provide an entire semester of study consisting
of text and assessment materials; slides and movies can be computer controlled
and integrated with the materials stored in the CD-ROM disks. Obviously, at the
current time, it is not the technology that limits its applications to science education
but the failure to generate software and learning opportunities to take advantage
of the existing capabilities. Other impediments to widespread use involve the
cost of computer hardware and the training necessary to enable teachers to feel
comfortable using computers in their classrooms (Cohen, 1988).

In this section, some possible uses of the computer for monitoring what
students know and can do in science are described. The perspective taken is that
of the classroom teacher. However, the data generated through computer-based
assessment can potentially be aggregated for policymakers at the school, district,
state, national, or even international levels. We discuss four possibilities: item banks,
simulations, telecommunications, and microcomputer-based science laboratories
(MBLs).

First, some teachers are currently using computers to build item banks for their 
unit tests and final exams. The format of these items usually is multiple-choice. Some
teachers keep item statistics to see how well students do on individual items over 
a period of time. In some cases, teachers even develop and print alternative forms 
of their tests for use in classrooms where students sit close together. The promise 
of this use lies in the potential for exchange and quality control of items among 
teachers, provided the items are openly available. 
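
The item statistics mentioned above can be as simple as the proportion of students who answer each item correctly. The sketch below assumes a hypothetical layout in which each answer sheet is recorded as a mapping from item identifier to right or wrong; the function names, the cutoff values, and the data are all invented for illustration.

```python
from collections import defaultdict


def item_difficulty(answer_sheets):
    """Return the proportion of students answering each item correctly.

    answer_sheets: list of dicts mapping item id -> True/False (correct?).
    """
    attempts = defaultdict(int)
    correct = defaultdict(int)
    for sheet in answer_sheets:
        for item, is_correct in sheet.items():
            attempts[item] += 1
            if is_correct:
                correct[item] += 1
    return {item: correct[item] / attempts[item] for item in attempts}


def flag_items(difficulty, low=0.25, high=0.95):
    """List items nearly everyone misses or nearly everyone gets right."""
    return sorted(item for item, p in difficulty.items() if p < low or p > high)


# Invented answer sheets for four students on three multiple-choice items.
sheets = [
    {"Q1": True, "Q2": False, "Q3": True},
    {"Q1": True, "Q2": False, "Q3": False},
    {"Q1": True, "Q2": False, "Q3": True},
    {"Q1": True, "Q2": True, "Q3": False},
]
difficulty = item_difficulty(sheets)
flagged = flag_items(difficulty)
```

Tracked from term to term, figures of this kind give teachers who exchange items a common basis for judging which ones deserve a place in a shared bank.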

Second, the use of simulations makes possible a very different kind of learning
and assessment. Simulations can be classified into two types, according to whether
the student plays an active or passive role. Passive simulations are like teacher
demonstrations in that the student observes the scientific phenomena. The advantages
of using a computer are several: it can substantially reduce (or elongate) the time it
would take for phenomena to occur; it allows students to observe phenomena that
would require expensive, unwieldy, unavailable, or dangerous equipment; and it can
enlarge or reduce the scale of phenomena to make them observable in the
classroom. For example, within a part of a class period, the students can observe
many generations of genetic offspring, ecosystems with prey and predators, or
geologic or astronomical phenomena covering vast regions and taking thousands
of years to occur. Also, the light and sound emanating from the computer often provide
a more motivating learning situation for some students than the same material
presented in textbooks or classroom lectures. Furthermore, such simulations may
deepen the understanding of students about how things actually occur, making
possible more complex analysis and evaluation activities. The simulations can
become the stimuli for assessments that ask students to explain the phenomena
and make and justify predictions about related phenomena.

Active simulations create a hands-on environment that provides an opportunity
for students to manipulate many of the variables involved in scientific phenomena
and to observe their effects. For example, computer programs exist that simulate
the operation of a nuclear power plant. Using control rods, the student can control
the amount of heat generated by the reactor and thus the amount of steam formed
and electricity generated. In mechanics, simulations exist to teach students about
combining gears to perform desired functions. ThinkerTools is an example of an environment
created to teach students many of the more difficult and abstract concepts
related to the laws of motion (Raizen et al., 1989). In assessment contexts, the
students can be asked to solve complex problems involving the manipulation of
variables in these simulated environments. Different levels of abstraction, transfer,
and application to real-world contexts can be incorporated in the assessment problems.
The technology currently exists to track students' thinking by programming
the computer to keep records of the strategies students use to try out their solutions.
This permits the teacher to monitor the understanding of the students, to see
whether their problem-solving strategies are deliberately constructed or random
trial-and-error.

Third, telecommunications permit students from different locations to work
together on a common problem. Sometimes this entails studying the effects of environmental
variables on natural phenomena. For example, one middle school application
has students in different countries at different altitudes reporting the
temperature at which water boils. A second one involves students at many points
along a river taking water samples and reporting the data from the analyses of these
samples to one another. In some sense, the students are functioning like managers
of databases (Guertin et al., 1987), allowing them to discover relationships, note
trends, and form hypotheses. These databases often challenge the students to investigate
a topic further and report findings back to the class and to others on the
telecommunications network. In the event, the students are motivated to use their
textbooks, their teachers, and other experts in their search for understanding. The
assessment opportunities within these learning events are unlimited in that teachers
can monitor virtually any combination of the students' scientific understandings
and research and communication skills. (See March 1987 issue of Classroom Computer
Learning.)

Fourth, the use of sensors or probes in microcomputer-based science laboratories
(MBLs) is particularly well suited for science education, because it allows the students
to conduct hands-on investigations, with the computer assisting in gathering and
presenting data. The students set up the apparatus and perform manipulative operations
just as in a traditional laboratory, but the data are presented in graphic form
as they are collected in real time. The students see the data displayed on a graph
or table as they are collected and see relationships as they happen. According to
Abeles (1989), there are several advantages to MBLs. Data collection is less tedious
and more accurate, precise, and efficient. And data can be gathered about phenomena
not readily available before (e.g., reading the thermometer inside the freezer section
of a refrigerator every three minutes for twenty-four hours).

These motivating and engaging qualities of MBLs give the students the oppor- 
tunity to gain a broader perspective on what is taking place in an investigation and 
enable them to pay much more attention to the phenomena and the concepts being
studied rather than, as some teachers have put it, "getting lost in the data." For
example, according to Guertin et al. (1987:6-7): 

...While the MBL product is continuously measuring temperatures during a 
cooling curve experiment, students can watch the sample rather than the thermometers
and observe that the crystallization coincides with the temperature
plateau: Students have time to detect an effect and look for its cause. They 
are motivated to seek explanations for the relationships they observe. For ex- 
ample, should the temperature drop suddenly during an experiment, students 
would be alerted immediately and might discover that a draft had been created 
from an of ?n window. Using traditional methods, the students might not have 
detected the data anomaly until they graphed the data after the laboratory 
session. 
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Many researchers have found that reakime graphing increases the students' abili- 
ty to interpret graphs (Mokros, 1986; Linn, 1988), Guertin et al (1987) note that an 
entire experimental cycle can be completed within the attention span of the stu- 
dent because the time between data acquisition and data analysis and display is 
brief. For this reason, MBLs have great potential for use in investigative sdence with 
younger students. Faster data acquisition also allows the students flexibility to repeat 
the experiments, explore mote cases dan experiment, and do nrae experiments. 
Furthermore, many students are motivated to ask "what iT questions and change 
the parameters of the investigation. In addition, data can be collected over periods 
of time longer than the school day, extending from several hours to overnight to 
several days or weeks, finally, the students can see the same data displayed and 
printed in a variety of tabular and graphic formats, and they are able to analyze it 
at a later time. 
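
As a rough illustration of the real-time alerting described in the quotation from
Guertin et al., the sketch below flags a sudden temperature drop as readings arrive.
The readings, the threshold, and the function name are hypothetical and are not taken
from any actual MBL product; the first line simply works out how many readings the
freezer example (one every three minutes for twenty-four hours) would produce.

```python
# Readings needed for the freezer example: one every 3 minutes for 24 hours.
READINGS_PER_DAY = 24 * 60 // 3


def find_sudden_drops(readings, max_drop=1.0):
    """Return (index, drop) pairs where the temperature falls by more than
    max_drop degrees between two consecutive readings."""
    alerts = []
    for i in range(1, len(readings)):
        drop = readings[i - 1] - readings[i]
        if drop > max_drop:
            alerts.append((i, drop))
    return alerts


# Hypothetical temperatures (degrees C); the fifth reading reflects a draft
# from an opened window.
readings = [25.0, 24.6, 24.3, 24.1, 20.6, 20.5, 20.4]
alerts = find_sudden_drops(readings, max_drop=1.0)

for index, drop in alerts:
    print(f"Reading {index}: temperature fell {drop:.1f} degrees")
```

With a check like this running during the session, the anomaly is visible at the
moment it occurs rather than after the data are graphed.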

Once the students become familiar with the technology, they are free to ask their
own questions, generate their own hypotheses, and explore them in their own way.
They control both the nature and the pace of their experimentation. They collect
their own data and portray it in the graphic mode that seems best for their purpose.
They are encouraged to verify, replicate, and make sense of their data, use a variety
of approaches, and communicate their findings to others. This represents an
extension of the opportunities inherent in the active simulations previously
described, because the use of the data-gathering probes allows the students to collect
data in the real world. The students are free to explore their environments,
manipulate variables, and observe outcomes. They have an opportunity and
increased motivation to explore a phenomenon deeply. The new technology should
also enhance transfer of learning and notions about the relevance of science.

Furthermore, scientific dispositions are likely to come to the fore in the use of
MBLs. MBLs enable the students to feel like scientists. The students come up with
problems and collect real data. They then portray the data professionally. They also
have opportunities to work collaboratively, and the classroom can become a
community of inquirers. In this way, the students can develop and display many of the
attitudes, attributes, dispositions, and habits of scientists.

Applications of sound and light probes exist in all branches of science, often
making visible those processes that previously could only be read about. For example,
body functions and reactions can be studied by viewing and analyzing heart and
respiration rates, skin resistances and temperatures, electrocardiograms and
electromyograms. Even brain waves can be viewed and recorded. In physics, photogate
probes are used for measuring velocity and acceleration, sonic transducers for
measuring distances, strain gauges for measuring force, and thermistors for measuring
temperature. Chemical reactions can be studied through colorimetric or
potentiometric techniques. As noted by Guertin et al. (1987), the ability of the computer
to act in place of voltmeters, freeze-frame oscilloscopes, thermometers, pH meters,
light meters, and a host of other laboratory instruments, coupled with the fact that it allows
large quantities of data to be quickly collected, organized, stored, graphed, and
analyzed, means that it might be viewed not as an expensive tool but rather as the
"best buy" in town.

Guertin et al. (1987) note that one of the strongest arguments for the use of the
MBL in school science classrooms is that college laboratories, research institutions,
medical laboratories, and industry now use digital data acquisition devices rather
than traditional laboratory methods. Hence, experience with the MBL in school
laboratories will prepare the students for advanced MBL activities in higher
education and in scientific and technical occupations. We would urge, however, that
simultaneously with such experience, students at the middle level also develop a
strong foundation in understanding how basic measurements are made and the
various uncertainties attached to different types of measurements, as discussed in
chapters.

The assessment opportunities arising from the application of MBLs are limited
only by the teacher's time and inclination to make use of them. The records kept
by the students can provide a rich base for assessing their operational and
conceptual knowledge as well as their thinking skills. Problems of some complexity and
sophistication can be developed, which can be addressed in a classroom period (or
over several periods) using MBL-generated data. Social skills can be assessed as
groups conduct inquiries using the MBL. Almost every strategy for assessing science
that goes beyond paper-and-pencil, short-answer formats can benefit from the use
of the MBL.

Chapter VII

Assessments and Policy



The emphasis in the preceding two chapters was mainly on improving assessments
carried out by the classroom teacher in support of good science instruction and to
evaluate the students' learning and performance. In this chapter, we take up issues
related to assessments carried out for broader policy purposes. Indeed, educational
administrators, school board members, legislators, and other educational policy
makers are increasingly turning to tests for information to assist them in
monitoring outcomes, setting goals, allocating resources, and, most important, holding
districts, schools, and even individual teachers accountable for the learning of their
students. Currently, most of the tests given for educational policy purposes are
entirely separate from the ones that teachers select or create to use in their own
classrooms. They are often referred to as externally mandated tests, because they
are administered under the aegis of some authority beyond the classroom, either
within or outside of the educational system. Tests used to inform policy and increase
accountability are given on a much larger scale than classroom tests, often to all
of the students in a district or state, to nationally representative samples, and
sometimes even across international borders. They are given less often than
classroom tests and generally sample the contents of a year or more of the
curriculum.

Externally mandated tests often focus on critical transition points in the
schooling process, including the middle-level years. Thus, testing during eighth grade is
ubiquitous, occurring at the district, state, national, and international levels. Because
they are almost always used for making comparisons among the classrooms, schools,
districts, states, or nations tested, these externally mandated tests are carefully
standardized and must be given at a specified time regardless of the pacing of
instruction by a particular teacher.

Both the students and teachers are likely to regard externally mandated tests as
less important than classroom tests. The standardization and breadth of coverage
of these tests usually mean that they are less closely tied to the curriculum than
classroom tests, and the students are often told that their report-card grades will not
be affected by how well they do. Many external testing programs employ elaborate
safeguards to protect the anonymity of teachers and schools, as well as of individual
students. Nonetheless, these tests can have major direct and indirect effects on
curriculum, instruction, and learning, and they merit close attention by anyone
concerned with science assessment.

High-Stakes Testing:
Impacts for Good or Ill



To understand the importance of externally mandated tests, consider the
following examples:

• A principal reviews the annual test score means for each of the school's
fifth-grade classrooms;

• A school board compares school means on the district-wide tests;

• The local newspaper publishes average scores, by school, on the state
assessment;

• Legislators anxiously monitor the press coverage concerning the "Wall
Chart" put out by the Secretary of Education in Washington;

• The National Assessment of Educational Progress (NAEP) introduces
state-by-state comparisons, and both educators and political analysts ponder the
potential for good or ill;

• International comparisons show eighth graders in the United States near the
bottom of the countries tested in science and mathematics knowledge, and
educational leaders respond with calls for increased education funding,
better teacher training, and more rigorous curricula.

When policy makers or the public begin to use test scores to make comparisons
and judgments, the scores become important in their own right, and the testing
becomes "high stakes." At all levels of the educational system, the test scores
become a factor as decisions are made about budgets, textbooks, curriculum
frameworks and guidelines, and ultimately about the ways that students spend
their time in classrooms.

Teachers might understand very well that "less is more": that true scientific
literacy and critical thinking would be helped more by a deep, extended,
multifaceted treatment of a few topics than by a superficial survey of many topics.
But the teachers who teach a few things well run the risk that their students will
be unfamiliar with most or even all of the questions on high-stakes tests that
sample factual knowledge on dozens of topics. The best textbooks for improving
scores on such tests may be those that cram in the most content, at whatever cost
to depth of understanding. Classroom time spent on generating questions,
planning experiments, learning to observe and reason, and learning to discuss the gains
and losses involved in alternative solutions to technological problems might do
little to improve test performance.

If educational administrators and policy makers, parents, and the public insist 
on treating high scores on tests of factual knowledge as ends in themselves, scien- 
tific literacy and critical thinking will suffer. 

Despite their possible negative effects, externally mandated tests also can be a
force for the improvement of science learning. In the short term, even poorly
designed tests can focus public attention on the need for educational reform and
increased financial support. By showing what is possible in the best schools or under
the best conditions, tests can help raise the sights of educators and policy makers
elsewhere. Of course, much greater benefits could be expected from sound,
comprehensive tests providing valid information about the full range of intended
learning outcomes. Such tests could guide the allocation of educational resources to areas
of greatest need and could help in the formulation of [...]
at the classroom, school, district, or state levels. They could also be used in
large-scale evaluations of alternative curricula, instructional practices, and educational
policies. Finally, valid and comprehensive tests could illustrate for teachers and
students the kinds of outcomes and levels of attainment expected.



Measuring the Full Range of Learning Outcomes

Most externally mandated testing programs, even when done on a sample basis,
involve large numbers of students. Therefore, it may appear prohibitively
expensive to employ testing formats other than paper-and-pencil tests with
multiple-choice items. In the long run, however, the costs of not employing a broader array
of testing formats and response modes may be even higher. The only way to
minimize the risks and maximize the benefits of high-stakes testing is to assess a
full range of important learning outcomes. In middle-level science, this is likely to
require that the students use scientific apparatus and that they respond to some
kinds of questions that call for open-ended responses, rather than a selection
among a small, fixed set of alternatives. In addition, even within the constraints of
written, forced-choice tests, there may be room for substantial improvement in the
range of learning outcomes measured.

Alternative Testing

Most of the research on alternative testing formats (forced-response versus essay
tests, for example) has shown that different kinds of tests tend to rank order
students in about the same way. Such findings have been used as a justification for
continued reliance on relatively inexpensive testing formats. If tests that are more
costly to administer and score yield no more information than inexpensive
forms of tests, why use them? There are two reasons.

First, as discussed above, testing is reactive. The educational system can change
in response to accountability mechanisms, including tests. Teachers and students
will look to the test content for messages about the forms of learning outcomes
expected, and curriculum and instruction will evolve in the direction of greater
emphasis on those outcomes tested.

Second, both logic and empirical research affirm that some of the most
important outcomes of middle-level science education cannot be measured adequately
using paper-and-pencil, multiple-choice items. Frederiksen (1984) has investigated
students' ability to pose plausible hypotheses to explain patterns of experimental
findings. The format is simple: An experiment is described and its results are
presented, often with the aid of a simple graph or figure. The students are then asked
to pose as many reasonable explanations for the findings as they can, and their
responses are rated for quality. After developing and validating this "formulating
hypotheses" test, Frederiksen attempted to create a multiple-choice test
measuring the same ability. He and his associates found that the multiple-choice version
of the test failed to measure the same abilities as the free-response version. Similarly,
assessing the quality of a student's writing according to several different constructs
of writing performance yields scores that, while correlated, appear to capture
different competencies. Moreover, the parameters of the tasks required by a test
constrain writing performance (Applebee et al., 1989). Even without appealing to
empirical research, it is clear that the ability to design simple experiments or use
scientific apparatus safely and correctly will be difficult or impossible to test fully using
multiple-choice questions.

A recent science assessment of fourth graders in the state of New York
demonstrated the feasibility of large-scale testing using simple apparatus. A series
of stations were set up in each classroom; at each station the students were to use
a ruler, a simple pan balance, or other equipment to answer questions on a test they
carried with them from station to station in five-minute rotations. The skills being
assessed included measurement, prediction based on observation, categorization,
inference, and forming hypotheses. The students' scores were recorded only at the
school; the school's scores were reported at the state level in terms of percentage
of item difficulties. The assessment may have fallen short of testing all the forms
of scientific reasoning that might be hoped for even at the fourth grade, but it did
yield very significant information about the science program in New York schools
and particularly about the students' limited exposure to hands-on science. The more
ambitious plans for state assessments, including performance items and more
extended exercises being formulated in California and Connecticut, were described
in the preceding chapter.

Better Multiple-choice Questions 

Multiple-choice questions are often thought of as testing no more than factual recall,
but they have been used successfully to measure a much broader range of outcomes.
On science tests, for example, a multiple-choice stem might pose a scientific
question and then describe several different experimental setups or procedures that might
be used to investigate it. The correct answer is the one in which experimental and
control groups differ only with respect to the matter at issue. This multiple-choice
item format tests the important conceptual understanding Piaget termed
controlling variables, which is often presented in middle-level science as a fundamental
principle of scientific method. Multiple-choice questions can test more complex
kinds of reasoning, as well. Imagine an item describing the history of a rock
formation and presenting a diagram showing different geological strata identified by
letters or numbers. Students' understanding of basic principles might be assessed
by questions asking about the order in which the strata were laid down, where one
might look for particular kinds of fossils, or which of several explanations accounts
for some stratum not being horizontal. Obviously, other questions about the same
figure might test vocabulary or other recall forms of knowledge.
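
The scoring rule for the controlling-variables item described above (the correct
option is the setup in which the experimental and control groups differ only in the
matter at issue) can be sketched in a few lines. The plant-growth setups, variable
names, and function names below are hypothetical illustrations, not items from any
actual test.

```python
def differing_variables(experimental, control):
    """Return the set of variable names whose values differ between groups.
    Assumes both groups are described by the same variable names."""
    return {name for name in experimental if experimental[name] != control[name]}


def is_controlled(setup, variable_at_issue):
    """A setup is a controlled comparison only if the groups differ in the
    variable at issue and in nothing else."""
    diffs = differing_variables(setup["experimental"], setup["control"])
    return diffs == {variable_at_issue}


# Hypothetical answer options for the question "Does light affect plant growth?"
setups = {
    "A": {"experimental": {"light": "sun", "water_ml": 50, "soil": "sand"},
          "control": {"light": "shade", "water_ml": 20, "soil": "sand"}},
    "B": {"experimental": {"light": "sun", "water_ml": 50, "soil": "sand"},
          "control": {"light": "shade", "water_ml": 50, "soil": "sand"}},
    "C": {"experimental": {"light": "sun", "water_ml": 50, "soil": "sand"},
          "control": {"light": "sun", "water_ml": 50, "soil": "sand"}},
}

correct = [name for name, setup in setups.items() if is_controlled(setup, "light")]
print(correct)
```

Option A confounds light with water, and option C varies nothing, so only option B
isolates the variable in question.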

These examples illustrate that multiple-choice items can be constructed to
measure reasoning and understanding of scientific method. Note, however, that such
understandings are best tested in the context of an actual problem, in conjunction
with relevant factual knowledge. When individuals reason, they reason about
something. It follows, as we noted in chapter III, that poor performance on such items
can be due to deficient understanding of either the processes of scientific
reasoning called for or the factual information required to apply that reasoning in a
particular situation. (Of course, poor performance can also reflect poor motivation,
failure to understand test instructions, poor reading ability, or other causes.)

These examples also illustrate that multiple-choice items testing higher order
thinking skills will almost always require the presentation of more elaborate stimuli
than most questions measuring factual recall. More text, figures, charts, graphs, and
diagrams will be required to describe the problem situation the students are asked
to reason about. As a result, such test questions may require a higher level of reading
ability than questions that test knowledge of facts and principles, as well as the ability
to interpret graphs and other kinds of displays. They may also call for greater
effort, attention, and motivation on the part of the test taker. Any of these additional
requirements might serve to lower some students' scores (that is, percent of items
answered correctly) versus expectations, depending on the conditions of testing.

Multiple-choice questions employing more complex stimuli will also take longer
to answer so that fewer items can be administered in a given period of time. For
all these reasons, reliable and valid multiple-choice tests will be more difficult to
construct for higher order thinking skills than for factual knowledge. In addition,
items on such tests will tend to be harder for the students to answer and harder for
instructors to teach to, and so some students and teachers might resist movement
in the direction of testing higher order skills. It follows that significant improvements
in externally mandated tests, including multiple-choice tests, are unlikely unless
concerned parents, citizens, educators, and curriculum specialists insist on better tests,
measuring a broader range of important learning outcomes.

Information Needs of Decision Makers:
Achievement and Context

Validity should be regarded as a property not of tests, or even of test scores, but of
test-score interpretations. Valid interpretation of achievement scores, for
individuals, classrooms, schools, or larger aggregations, always requires
supplemental information about at least some of the many factors known to influence
achievement. Teachers using tests in their own classrooms bring a wealth of background
information about individual students' earlier performance levels, interests, and
other characteristics, as well as knowledge about their own curriculum and
instruction. This information enables them to set reasonable expectations for achievement
levels and evaluate the plausibility of alternative explanations for low or high scores.
Even more important, it enables teachers to make better decisions about
what to change to improve learning, for individual students or for the class as a
whole.



Assessment for Improvement 

Valid interpretation of scores on externally mandated achievement tests likewise
requires contextual information if policy makers are to use the results to improve
the students' achievement. For externally mandated testing programs limited to a
single school or a small school district, decision makers might already have access
to sufficient information to interpret the scores appropriately. School principals
and district personnel would need to know or find out about the curricular goals,
textbooks and other instructional resources, and teaching practices in the
classrooms tested; the students' performance in prior years and in other content
areas; and something about the communities served by the different schools
where the tests were given. Depending on the focus of the testing program and the
particular score interpretations intended, additional information might be
required. Policy makers might want to look in greater detail at the students'
opportunity to learn the facts or concepts covered in each test question. They might ask
about the teachers' formal training in science, or about the number of years the
present textbook series has been in use. Without such information, it is
impossible to say from achievement levels alone which teachers or schools are doing
poorly and which are doing well. One school may be doing extremely well in the
light of its students' language backgrounds and other contextual factors, and still
have lower scores than a mediocre school serving more advantaged learners.
Another consideration is the match between the curriculum and the test. Where
achievement is poor, contextual information is essential to determine to what
extent the content of the test corresponds to the curriculum. (Of course, good
correspondence merely indicates that the test is appropriate to the curriculum, not
that the test, or the curriculum for that matter, reflects good science
instruction.) These examples could easily be multiplied. As the scale of testing programs
increases, it becomes less likely that decision makers will have the needed
contextual information at hand. Thus, it becomes increasingly important to collect it
in conjunction with achievement-test data. This is most often accomplished by
giving the students "background questions" in conjunction with test items, and
by providing separate questionnaires to be completed by their teachers and by
knowledgeable personnel at the school level.

Baron and Forgione (1989) discuss the kinds of background information it is useful
to collect in large-scale assessments. Discipline is required when assembling such
questions: "it would be nice to know" is not a sufficient justification for using
precious questionnaire space and testing time. They recommend that all background
questions at the student, teacher, and building levels should satisfy at least one of
three criteria for inclusion. First, information must be collected on demographic
factors that will be used to organize and report the results. These include questions
on gender, racial and ethnic identification, socioeconomic status, and other
demographic characteristics. This sort of information is important for
understanding how educational resources and classroom processes (that is, the opportunity
to learn) as well as student achievement and attitudes are distributed across
different groups of students. Second, information should be collected concerning
schooling factors known to influence achievement, including indicators of classroom
process. Finally, background questions can be included because they assess
educational outcomes important in their own right, apart from academic achievement.
These include questions about attitudes, beliefs, and behaviors. In an eighth-grade
science assessment, for example, students in Connecticut responded on a scale from
"strongly agree" to "strongly disagree" to statements such as "Careers in science
are more appropriate for men than for women" and "My knowledge of science will
be of little value to me in my day-to-day life" (Baron and Forgione 1989:189). As
a further measure of attitudes, the students were also asked how many years of high
school science they expected to take.

To learn something about instructional practices in science, younger children were
asked in separate questions if they had ever used a magnifying glass, a metric ruler,
a thermometer, or a magnet in science and whether they had ever made a simple
electrical circuit or an electromagnet. Eighth-graders were asked on a scale from
"never" to "more than ten times" how often they had used a triple-beam balance,
a graduated cylinder, or a microscope, and how often they had set up an electrical
circuit. The eighth-grade assessment included actual use of a triple-beam balance
to weigh an object. Responses to the experience question were strongly related to
success on the performance task. (See also the discussion above on the more
extensive test of manipulative skills administered to all fourth graders in New York.)

Teachers and principals can be asked parallel questions about use of equipment
and so forth, as a check on the students' responses. In the Connecticut Assessment
of Educational Progress there were also specific questions about the availability of
good science teachers, the amount budgeted specifically for consumable science
supplies, the amount budgeted specifically for purchase of new science equipment, and
the numbers of microcomputers available for science instruction. Respondents were
also asked to rate the seriousness of such problems as "a general belief that science
is less important than other subjects," "out-of-date teaching materials," "lack of
materials or equipment," "inadequate budget for science," "lack of student interest
in science," "lack of teacher interest in science," "teachers inadequately prepared
to teach science," "lack of support of administration," "teachers' views not
incorporated into curricular decisions," and "lack of opportunity and/or support for
in-service." Teachers also reported on whether science equipment was available to them
and, if so, whether they had to share it. In addition, they were asked to indicate how
well trained they felt they were to teach science (Baron and Forgione 1989:206-210).

Similar kinds of questions have been asked in connection with the science
assessments conducted by NAEP, by several studies conducted by the National
Center for Education Statistics (Schools and Staffing Survey, National Education
Longitudinal Study of 1988), and in the 1985-1986 National Survey of Science and
Mathematics Education (Weiss, 1987).

In short, we suggest that policy makers need to understand and document the
status of some of the conditions that influence what students actually learn in school.
According to Oakes (1989), the following three categories of variables are
important to examine:

Access to Scientific Knowledge. Availability of instructional materials,
laboratories, computers, and equipment; teachers' qualifications and experience
in science; scheduling (for example, departmentalized, discrete classes, or
interdisciplinary teams); classroom assignment practices (grouped by ability or mixed
instructional groups) and the curriculum associated with each group; availability
of academic support and enrichment programs (tutoring, after-school remediation,
science fairs, field trips, museum programs); and parental involvement in science
instruction or science activities.

Press for Science Achievement and Participation. Opportunities for
school-wide recognition of science participation and accomplishments; curriculum and
instructional activities focused on challenging, real-world scientific concepts and
problems; faculty beliefs about the students' ability to learn science (for example,
whether all students are capable of learning science); faculty emphasis on science
as an interesting and important subject for students at the middle level; instructional
leadership in science, that is, the extent to which a significant person or group at the school
advocates and supports science curriculum and instruction; and the degree to which
noninstructional constraints interfere with science activities.

Professional Conditions for Science Teaching. Teachers' salaries; teachers'
student load and class size; clerical support staff available for noninstructional tasks;
time available for professional, non-teaching work; time spent on collegial goal
setting, program planning, and instructional improvement; participation of the staff
in school-wide decision making; administrative commitment and involvement in
staff development in science; and administrative support for professional risk
taking and experimentation.

As with student outcome measures in science, assessment of several of these
important program characteristics must in part rely on human judgment. Obviously,
assessments of science programs cannot possibly provide the complex data
researchers need in order to understand fully the relationships among program
characteristics and science outcomes. They can, however, provide useful clues to
policy makers about strengths and problem areas. The challenge is to design
assessments that provide the most central information with a parsimonious set of
indicators. Relevant kinds of questions about programs, asked in conjunction with
student assessments that involve (1) multiple-choice questions testing both lower
order and more complex reasoning skills, (2) hands-on performance exercises
requiring actual use of science equipment, and (3) assignment of problems
involving extended and in-depth work, can begin to illuminate the complex questions
facing educational decision makers. If large-scale assessments are to help inform policy
and ensure accountability, both the tests themselves and the background
information providing a context for interpreting performance must be sound, reliable,
and comprehensive.



Erosion of Validity 

Valid interpretation of test results will become more difficult as mandated
assessments grow, particularly when they involve high-stakes testing. As noted,
validity inheres not in a test itself, but in an intended test interpretation, an inference
based on a score. There may be different logical bases for such inferences, calling
for different strategies of test design and validation. Consider three examples: a
college admissions test, a typing test for applicants for a secretarial position, and an
achievement test administered by a state or district. The warrants for using the SAT
or similar tests to help reach college admissions decisions include both logical
arguments from the tests' content and design and empirical arguments from their
observed correlations with college grades and other indicators of success. In
contrast, the typing test directly samples performances that are a part of the work the
person hired will be expected to do. The achievement test probably would be
intermediate between these first two examples. To the extent that it directly sampled
some domain of proficiencies that the students were expected to acquire, as, for
example, use of a thermometer or an equal-arm balance, it would be like the typing
test. To the extent that it was intended to show what children were likely to do or
be capable of doing in non-test situations, its validity would have to rest on logical
or empirical grounds, areas that need much further exploration and work in the
case of science tests (Frederiksen, 1986).

Erosion of validity may be said to occur when, as an indirect result of using the 
test, the warrant for the intended score-based inferences is weakened. In the case 
of college admissions tests, coaching that concentrates on test-taking skills or practice
with feedback in answering multiple-choice items may improve test scores without
bringing any concomitant improvement in the complex, developed aptitudes the
test is intended to reflect. If such coaching improves the scores of some examinees,
the correlation between test performance and subsequent college success is likely
to be reduced, thereby eroding the test's validity as a predictor. (Of course, a
longer-term program of coaching that focused on the underlying skills the test was intended
to assess might improve both test performance and criterion performance. That would not
affect the test's validity.) In the case of the typing test or reading a thermometer, it
is more difficult to imagine any kind of training that would substantially improve
test performance without also improving criterion performance. A work-sample test
is highly resistant to erosion of validity.
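
The coaching argument above can be illustrated with a small numerical sketch. Everything in it is hypothetical: the ten examinees, the noise terms, and the 15-point coaching gain are invented for illustration and are not data from any study cited in this report. The sketch compares the test-criterion correlation before coaching, after test-taking coaching for some examinees, and after instruction that raises both test and criterion performance.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical underlying aptitude for ten examinees (arbitrary units).
aptitude = [40, 45, 50, 55, 60, 65, 70, 75, 80, 85]

# Hypothetical criterion (for example, later college grades): aptitude plus noise.
criterion_noise = [3, -4, 2, -1, 4, -3, 1, -2, 3, -3]
criterion = [a + n for a, n in zip(aptitude, criterion_noise)]

# Hypothetical uncoached test scores: aptitude plus separate noise.
test_noise = [-2, 3, -3, 2, -1, 1, -2, 3, -3, 2]
uncoached = [a + n for a, n in zip(aptitude, test_noise)]

# Assumed coaching gain: 15 points for every third examinee, others unchanged.
gain = [15 if i % 3 == 0 else 0 for i in range(len(aptitude))]

# Test-taking coaching raises test scores only; the criterion is unchanged.
coached = [s + g for s, g in zip(uncoached, gain)]

# Instruction in the underlying skills raises test and criterion together.
taught_criterion = [c + g for c, g in zip(criterion, gain)]

r_before = pearson(uncoached, criterion)
r_coached = pearson(coached, criterion)
r_taught = pearson(coached, taught_criterion)

print(round(r_before, 3), round(r_coached, 3), round(r_taught, 3))
```

In this invented example the correlation drops when only test scores are raised and does not drop when the criterion rises along with the scores, which is the pattern the paragraph describes. The size of the drop depends entirely on the assumed numbers.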

When policy actions that affect individual schools, administrators, teachers, or 
students are taken on the basis of assessment results, assessments become very
important. Teachers are more likely to teach to the test, and it is naive to ask them to
avoid doing so. Thus, the issue from an assessment perspective is to improve the
quality of such tests so as to make instruction based on their content worthwhile.

Similar cautions are in order with re[...]tions and educational goals. If
background questions are interpreted as indicators
of educational quality, they may be subject to the same erosion of validity as cognitive
questions. This may happen whenever the answers to questionnaire items are
treated as ends in themselves. Based on the illustrations just given, for example,
a well-intentioned but misguided teacher might decide to teach the use of a
magnifying glass or a triple-beam balance as an isolated skill. Likewise, questions about
the number of homework or writing assignments completed may invite the
proliferation of brief, meaningless assignments. Questionnaire developers should be
sensitive to such reactive effects of background questions and, whenever possible,
should word questions so as to discourage treating activities as ends in themselves.

Summary. 

Policy makers need content-valid outcome assessments set in the schooling
context. These assessments must mirror the goals of instruction. They should sample
the kinds of hands-on activities and extended problem assignments found in the 
best classroom science instruction and should also provide information on pro- 
gram and schooling features. 

Due to the reliability, versatility, and efficiency of multiple-choice items, such 
items are likely to continue to play a role in such assessments, but care should be 
taken that tests call for scientific reasoning and the application of scientific prin- 
ciples, not just factual recall. This is likely to require multiple-choice items with 
more complex, extensive stimuli than simple knowledge items. It is critical to 
recognize that some of the most important science learning outcomes may be 
nearly impossible to test with forced-choice items of any kind. As an instance, 
students should gain skill in formulating plausible explanations for experimental 
findings. Testing this skill may only be possible with free-response items requiring 
hand scoring.

Contextual information about teachers and learners, classroom resources and 
practices, as well as information about important affective learning outcomes may 
be obtained using background questions for students and separate questionnaires 
for teachers and principals, but care should be taken to discourage respondents or 
test users from treating instructional activities as ends in themselves. 
Chapter VIII

Recommendations 



The recommendations in this chapter are grouped around six principles, which
summarize our view of how student learning in science at the middle level should be
assessed: 

1. Assessment must be challenging and interesting. Classroom, school, and
large-scale science assessments must reflect the educational purposes at the
middle level and the growth and development of young adolescents.

2. Assessment must reflect science instruction, which itself should reflect the
goals for science learning, which in turn should reflect good science. Assess- 
ment must include both science knowledge and the laboratory, intellectual, 
and social skills crucial to the learning and doing of science. 

3. Reporting systems should reflect science assessments with fidelity. 

4. Educators involved at every level need to understand the new conception
of assessment and carry out relevant strategies, and their clients and audiences
need to understand the purposes and results.

5. Improving the quality of the science program in a school or district requires 
information on context as well as on outcomes. 

6. Further knowledge and new techniques must be created so that assessments 
of science learning and performance are faithful to the goals of science educa- 
tion and to the nature of science. 

For each of the principles, we provide a brief discussion as necessary and then 
some action steps for bringing about the kinds of assessments that would support 
good science instruction in the classroom and help policy makers in their efforts 
to improve science education for young adolescents. 



Principle: Assessment must be challenging and interesting. It must
reflect the educational purposes at the middle level and
the growth and development of the young adolescent.
Classroom assessments should be opportunities for teachers and students to examine
together how learning has progressed. It should be recognized that students as well
as teachers are important users of the information that assessments provide.

Students will find assessments challenging and interesting if, in solving the
problems posed, they discover new uses for the ideas and methods they have learned.
Performance assessments can be highly engaging and instructive, and they can test
learning outcomes that are difficult to measure in other ways. Projects and
productions of many different kinds can serve as the basis for performance assessments.



Recommendations

1. Assessments should include some long-term projects that involve the integration
of the knowledge, laboratory skills, and thinking and reasoning competencies
the students are expected to acquire.

2. Some classroom assessments should call for new applications of the material 
that has been learned. After learning about environmental adaptations, for
example, the students can be asked to "design an animal" to survive in a specified
environment. 

3. Even though some departmentalization is typical during the middle-school 
years, some science assessments should be integrated with assessments in other
content areas. Oral and written reports can demonstrate literacy and com- 
munication skills as well as scientific understanding. Laboratory workbooks 
should demonstrate growing skill in using mathematics as well as science. 

4. When middle-level students work collaboratively on group projects, at least part
of the assessment should address the quality of the group's effort. The students 
should not be asked to work cooperatively but then only be assessed 
individually. 

5. Large-scale assessments should strive to support the efforts of classroom and 
school assessments. 



Principle: Assessment must reflect science instruction, which itself
should reflect the goals for science learning, which in turn
should reflect good science.

Discussion 

Both in this report and in the earlier one on elementary school science, we have 
stressed the need for a correspondence between good science instruction and assessment.
Without this interlinking, the growing emphasis on assessment can only serve
to exacerbate the current poor condition of science education in U.S. schools.

1. Teachers should be prepared to model the integration of science knowledge, 
science laboratory skills, and science thinking skills in their instruction.

2. Teachers should integrate assessment and instruction, gathering assessment 
data as students are engaged in classroom science activities. 

3. Teachers should clarify the goals of their instruction, making sure that their 
students are equally clear about these goals—across the course and in the con- 
text of each instructional activity. 

4. The boundaries of classroom assessment must be expanded to integrate in- 
structional goals and information, including teacher observations, oral presen- 
tations, production of computer and constructed models, drawings, and 
research efforts in and out of laboratory settings. 

5. Teachers should design science instructional and learning activities that in- 
corporate the collection of concrete evidence of learning, including models 
that the students have built, reports, laboratory logs, computer output, essays, 
and records of oral presentations. 

6. Teachers should plan science instruction and learning activities that incor- 
porate both individual and group tasks. This will provide a wide variety of 
products to assess progress, both to inform future instruction and to give 
grades. 

7. Teachers should design science instructional and learning activities that pro- 
vide students ample opportunities to assess the quality of their own work, 
including encouraging students to keep written journals of their progress 
towards the learning goals. 

8. Teachers should provide opportunities for the collaborative interpretation of
the evidence accumulated and perceived by students on their learning and 
their own judgments and records of progress. 

9. Teachers should ensure that this collaborative evaluation of student progress 
results in a product that can be shared with parents, which gives middle-level 
students the responsibility for keeping their parents informed about their prog- 
ress on a regular basis. 

10. Teachers should use assessment data to modify instruction and plan future 
activities. 

11. Teachers, principals, and science supervisors should become partners, with
principals and science specialists providing regular checks on the effectiveness
of the instruction and the progress of students. Such independent observations
of the students' learning provide additional perspective, enhanced opportunity
for staff development, and a way to keep principals informed about
the school science program.

12. Superintendents have the obligation to support teachers and principals, both 
in terms of providing the necessary resources for facilitating district goals in 
science, including appropriate staff development in assessment, and in terms
of educating the community and local school board about the strengths of 
the new approach that integrates instruction and assessment. 



Principle: Reporting systems should reflect science assessments 
with fidelity. 

Discussion 

It is important that the messages sent to teachers and parents in assessment reports 
about what is important in science education not be antithetical or contradictory 
with the message to use instructional opportunities as assessment opportunities. 
Therefore, if there is a prescribed body of content and skills expected of the students, 
the teachers need to be able to incorporate them into their curriculum, instruction, 
and assessment. In that way, there will be a positive correspondence between the
data collected at the classroom level and the goals of the school and the state depart- 
ment of education. 

In many cases, there is no articulated curriculum that teachers feel a need to follow,
and they appear to be free to develop instructional opportunities for students ac- 
cording to their own goals. Sometimes, however, a curious and unfortunate situa- 
tion occurs in which national or state tests are administered to the students in a 
school, and the test results are then used to hold the school accountable for 
knowledge and skills that were not shared with the teachers. If the tests that are ad- 
ministered do not, in fact, represent the school's goals, then the teachers and school 
authorities should not treat the test results and report them to the community as 
if that were the case. The same holds true at a state and a national level. 

Teachers and administrators cannot control what an external evaluation agency 
might do, but they should exert their influence in interpreting test results and striving 
for better assessment. For example, if a nationally normed assessment of school 
science does not match the state's or a school's goals, the message sent to parents 
and teachers should not be that the school or state has not succeeded simply because 
the test results are poor. Rather, the school or state should be able to present a con- 
vincing case of how it has succeeded on the goals it has been striving toward. 

The reporting implications that follow from this discussion are that schools,
districts, and states need to be clear about the goals they are assessing. Then, they
ought to give considerable thought to what the observable indicators are (and on
which data should be collected) that would provide evidence that their goals have
been achieved.



Recommendations

1. Teachers should be involved in developing strategies to gather, analyze, and
portray assessment information that will be meaningful to parents, com- 
munities, and policymakers. 

2. State departments of education should provide technical assistance to enable 
teachers to gather, analyze, and portray data that will be meaningful to parents 
and communities. Assistance should also be provided to help teachers and
school officials to aggregate and report data useful for policymakers. 

3. Districts should help teachers develop alternative report cards in the form of
profiling, as distinguished from grading. These sorts of report cards, providing
descriptive information about each student's strengths and weaknesses, would
be useful for both formative and summative evaluation.

4. National and state agencies should seek ways to aggregate data collected at
the school level. This may require parallel and complementary data collection
efforts, checks using standardized assessment questions or tasks, or
"second opinions" by outside observers to ascertain the reliability and validity 
of the data collected locally. 

5. For certain core learnings, there should be a consensus effort to agree upon 
assessment strategies and reporting strategies throughout a state. States will
need to provide a technical assistance component to ensure that comparable
procedures are used for administration of assessment exercises and interpretation
and reporting of results.



Principle: Educators involved at every level need to understand the
new conception of assessment and carry out relevant
strategies, and their clients and audiences need to understand
the purposes and results.

Discussion 

Higher education institutions that educate prospective teachers and are charged with 
inservice staff development, associations of principals and superintendents, 
teachers' groups, associations of parents and school board members, and educational
writers all have important parts to play in fostering improved science learning
and assessment.
1. Higher education institutions should model, uphold, and give teachers the tools
to reproduce the principles of learning and assessment.
Some specifics follow:

• Appropriate emphasis (or a premium) should be placed on fostering a depth 
of understanding of scientific concepts and principles, in the teachers' own 
science preparation as well as in what they are expected to bring to their 
students. 

• Teachers should be taught and have opportunities to practice a variety of
strategies for monitoring their own level of understanding via individual journals,
small- and large-group discussions, and opportunities to compare their
own thinking, through discussion and further reading, with that of practicing
scientists.

• Higher education environments should model communities of inquiry in 
which teachers are encouraged to generate new questions, ask clarification
questions, and discuss their tentative hunches and hypotheses with others— 
both with respect to science subject matter and pedagogy for teaching science. 

• Institutions of higher education should foster and encourage persistence and
the assimilation of new information and experiences by giving teachers long- 
term assignments which require revisiting the same concepts, as they will 
be expected to do with their students. They also should foster and assess the 
acquisition of the dispositions of scientists, including the stimulation of in- 
tellectual curiosity, open-mindedness, and tolerance for ambiguity. 

• Higher education classrooms should provide opportunities and rewards for 
teachers to do true investigations (in contrast to verification exercises) in 
which teachers generate and clarify the problem to be researched, develop 
a strategy for data collection, analysis and portrayal, and communicate their 
findings to their classmates, their instructors, and possibly to other audiences, 
including members of the science community. In short, prospective teachers 
should be taught and assessed as they will be expected to teach and assess 
their students. 

2. Groups of superintendents, principals, and teachers meeting alone and with 
one another must work to improve the quality of science learning and assess- 
ment. Several specific recommendations follow: 

• These three groups should discuss strategies for examining the standardized 
tests used to assess the middle-level science program in order to determine
what messages they are sending about their goals in science education. 

• The same question should be asked about teacher-made tests and other in- 
formation used to determine report-card grades. What besides tests is used 
to determine report-card grades? What might the students conclude about 
the relative importance of breadth of knowledge and depth of understanding?
How are scientific skills, processes, and dispositions factored into the
determination of grades? 

• These three groups also should reflect on strategies to be used at the school 
level to ascertain whether the school's science curriculum addresses what 
they truly believe to be important for students to learn. Some questions to 
consider include the following: Does the curriculum as it currently is be- 
ing delivered produce students who see the relevance of science in their 
lives? Does it motivate students to take more courses in the biological, 
physical, and earth sciences in high school? What percentages of students
are taking more than the minimum number of required science credits?
What additional skills and knowledge would be required of the teaching
and administrative staffs in the school in order to design learning and
assessment activities likely to address needs identified by the above 
questions? 

3. Parents' groups and school board members should ensure that 
superintendents, principals, and teachers are free to design better learning 
and assessment opportunities for young adolescents. Superintendents, prin- 
cipals, and teachers claim that it is the parents and the school boards who 
want to know how the local school's students compare to students in other, 
similar schools or across the nation. Sometimes this perception results from 
the existence of school board policies calling for annual testing in science at 
the middle level. Parents' groups and school board members should confront 
the fact that the pressure to perform well on standardized, norm-referenced
tests pressures teachers to "cover the curriculum" represented in overstuffed
textbooks rather than to provide a set of more time-consuming learning and
assessment experiences that are aimed at a conceptual understanding of 
science. 

We recommend that parents' groups and school board members reevaluate
the goals they have for science education, how these goals are to be achieved,
and how achievement of the goals will be assessed so as to preserve their
intent.

4. Education writers also have an important part to play in bringing about
improved learning and assessment opportunities for young adolescents. If the
problems and potential solutions described in this report were made available 
to a public considerably larger than the one likely to read this report, parents 
and school board members could become more aware of the magnitude of 
today's science education dilemmas. Education writers can help by careful- 
ly examining current testing practices and reporting on their limited 
significance so that the public will demand that educators try some different
approaches to developing learning and assessment opportunities at the mid- 
dle level rather than using inappropriate and constraining practices. 
Principle: Improving the quality of the science program in a school
or district requires information on context as well as on
outcomes.

Discussion

Improvement of science education at the middle level hinges on an understanding 
and tracking of the process through which student learning in science as well as
other outcomes are produced. This kind of information is available to the teacher
for the individual classroom, but not to policymakers at more aggregate levels, 
unless it is specifically collected. 

Recommendations

1. National policy makers should set the tone for assessing the context in which 
science learning takes place by highlighting national data about essential program
characteristics: science program facilities and equipment; teachers'
backgrounds and qualifications; curriculum; instructional strategies; and professional
teaching conditions in schools.

2. State policy makers should include context assessments among the indicators 
of science program quality they use for school and district accountability or 
for triggering program improvement initiatives.

3. State education agencies and the research community should assist in the 
development of valid and useful measures of essential science program 
characteristics and schemes for reporting the results of such assessments. 

4. State education agencies and local school district administrators should pro- 
vide technical assistance to schools as they attempt to implement measures 
of the school context in valid and reliable ways and as they begin to use the 
results of such assessments to frame improvement strategies for science 
programs. 

5. Local district administrators and school boards must work with parents and 
the community to help them understand the importance of assessing and 
reporting information about the context of science programs. They must show
the community that such information can highlight problems and provide 
clues about potential solutions. They must communicate loudly that view- 
ing science test scores in the context of information about science programs 
can help communities move beyond self-congratulation or hand wringing by 
providing useful directions for school improvement in science education. 
Principle: Further knowledge and new techniques must be created
so that assessments of science learning and performance
are faithful to the goals of science education and to the
nature of science.

Discussion

Throughout this report, we have noted instances where knowledge and understanding
are inadequate. Examples drawn from earlier chapters include the extent to
which capacity for formal operational thinking can be developed in all young 
adolescents and the science experiences and programs that enhance such develop- 
ment; measurable attitudes and behaviors that are valid proxies for future engage- 
ment with science and application of scientific thinking skills; and identification 
of policy-mutable program variables that are strongly linked to desired student
outcomes for science education at the middle level. Further experimentation with and
development of valid assessment techniques that are sufficiently reliable for use
in large-scale assessments is urgent. Similarly, better means for collecting relevant
program and contextual information must be developed. These needs imply
support both for basic research and for development.

Recommendations for Research 

1. The National Science Foundation, the United States Department of Education,
and private foundations concerned with science education should sponsor
research programs designed to investigate how instruction, and what kinds
of science activities and content teaching specifically, can help develop for- 
mal operational thinking in young adolescents with different backgrounds, 
competencies, and preceding educational experiences. 

2. Interdisciplinary teams of researchers drawn from science education, the rele- 
vant science disciplines (that is, those generally included in middle or junior 
high school science curricula), psychology, and educational measurement 
should investigate the relationships currently posited among scientific attitudes
and behaviors exhibited in school (or reported on a questionnaire) and
disposition beyond the science classroom to apply science knowledge and 
thinking skills and continue one's engagement with science. 

3. Federal agencies and private foundations supporting research in education 
should invest in fine-grained longitudinal studies to establish linkages
between science programs and teaching variables and science learning out- 
comes for different student groups. (This is in contrast to large-scale 
longitudinal studies which, perforce, have to use gross process and outcome 
variables.) What is the role of different instructional strategies (hands-on and 
laboratory work, collaborative group work, long-term projects, oral and written 
presentations, use of the microcomputer-based laboratory)? What is the role 
of the textbook, trade books, other written materials, guest appearances by 
scientists, and science fairs? How important is parent involvement, and how
can it be engendered? How do the effects of these factors vary for girls? Boys?
For students from different ethnic and socioeconomic groups? To what extent
can science programs, to be successful with young adolescents, deal with
subject matter and abstractions important for science learning but far removed
from their experiences and ostensible interests? All these are questions that
need better information than is available at present, when too much science
instruction continues to be based on unverified practice and opinion.



4. Assessment strategies consonant with the goals of science education and ex- 
emplary science in the middle grades must be developed for use both by in- 
dividual teachers and in large-scale assessments. In particular, the National 
Assessment of Educational Progress and individual states should attach to 
each science assessment they conduct and evaluate some experimental 
assessment exercises that will probe complex and important science learn- 
ing outcomes not addressable through tests using multiple-choice or other 
short-answer formats. (See, for example, National Assessment of Educational 
Progress, 1987). Experimentation should include not only the design of such 
exercises but also innovative scoring protocols and other rating methods to 
explore their feasibility and reliability. Attention also needs to be given to cost 
implications, recognizing that the improvement of assessment will require
investment of additional resources or redeployment of current spending.

5. Similar experimentation needs to proceed with respect to the measurement 
of program variables and teaching conditions. We suggest, however, that,
unlike the experimentation with better outcome measures recommended
above, this experimentation take place separately from the large-scale
assessments of student learning. The reason for this separation is that these 
assessments are already very complicated and cumbersome and therefore not 
a good vehicle for the careful exploration of how best to track the 
characteristics of science programs and school conditions that have been 
shown (through the research recommended in number three above) to be
strongly linked to student outcomes. 

6. The best of assessment strategies will fail unless supported and adopted by 
the persons ultimately responsible for the students' development in science— 
the classroom teachers. We therefore urge that preservice and inservice 
teacher education materials be developed that empower teachers to carry out
assessments that will serve good science education in their classrooms. 